Combinatorial Approximation Algorithms for MaxCut using Random Walks



Abstract

We give the first combinatorial approximation algorithm for MaxCut that beats the trivial 0.5 factor by a constant. The main partitioning procedure is intuitive, natural, and easily described. It essentially performs a number of random walks and aggregates the information to provide the partition. We can control the running time to get a tradeoff between approximation factor and running time. We show that for any constant $b > 1.5$, there is an $\tilde{O}(n^b)$ algorithm that outputs a $(0.5 + \delta)$-approximation for MaxCut, where $\delta = \delta(b)$ is some positive constant.

One of the components of our algorithm is a weak local graph partitioning procedure that may be of independent interest. Given a starting vertex $i$ and a conductance parameter $\phi$, unless a random walk of length $\ell = O(\log n)$ starting from $i$ mixes rapidly (in terms of $\phi$ and $\ell$), we can find a cut of conductance at most $\phi$ close to the vertex. The work done per vertex found in the cut is sublinear in $n$.



1 Introduction 



The problem of finding the maximum cut of a graph is a classical combinatorial optimization problem. Given a graph $G = (V, E)$, with weights $w_{ij}$ on edges $\{i, j\}$, the problem is to partition the vertex set $V$ into two sets $L$ and $R$ to maximize the weight of cut edges (these have one endpoint in $L$ and the other in $R$). The value of a cut is the total weight of cut edges divided by the total weight. The largest possible value of this is MaxCut($G$). The problem of computing MaxCut($G$) was one of Karp's original NP-complete problems [Kar72].

Therefore, polynomial-time approximation algorithms for MaxCut were sought out, that would provide a cut with value at least $\alpha \cdot$ MaxCut($G$), for some fixed constant $\alpha > 0$. It is easy to show that a random cut gives a 0.5-approximation for MaxCut. This was the best known for decades, until the seminal paper on semi-definite programming (SDP) by Goemans and Williamson [GW95]. They gave a 0.878...-approximation algorithm, which is optimal for polynomial time algorithms under the Unique Games Conjecture [Kho02, KKMO04]. Arora and Kale [AK07] gave an efficient near-linear-time implementation of the SDP algorithm for MaxCut.¹

In spite of the fact that efficient, possibly optimal, approximation algorithms are known, there is a lot of interest in understanding what techniques are required to improve the 0.5-approximation factor. By "improve", we mean a ratio of the form $0.5 + \delta$, for some constant $\delta > 0$. The powerful technique of Linear Programming (LP) relaxations fails to improve the 0.5 factor. Even the use of strong LP-hierarchies to tighten relaxations does not help [dlVKM07, STT07]. Recently, Trevisan [Tre09] showed for the first time that a technique weaker than SDP relaxations can beat the 0.5 factor. He showed that the eigenvector corresponding to the smallest eigenvalue of the adjacency matrix can be used to approximate MaxCut to a factor of 0.531. Soto [Sot09] gave an improved analysis of the same algorithm that provides a better approximation factor of 0.6142. The running time² of this algorithm is $\tilde{O}(n^2)$.



All the previous algorithms that obtain an approximation factor better than 0.5 are not "combinatorial", in the sense that they all involve numerical matrix computations such as eigenvector computations and matrix exponentiations. It was not known whether combinatorial algorithms can beat the 0.5 factor, and indeed, this has been explicitly posed as an open problem by Trevisan [Tre09]. Combinatorial algorithms are appealing because they exploit deeper insight into the combinatorial structure of the problem, and because they can usually be implemented easily and efficiently, typically without numerical round-off issues.

1.1 Our contributions 

1. In this paper, we achieve this goal of a combinatorial approximation algorithm for MaxCut. 
We analyze a very natural, simple, and combinatorial heuristic for finding the MaxCut of a graph, 
and show that it actually manages to find a cut with an approximation factor strictly greater than 
0.5. In fact, we really have a suite of algorithms: 

Theorem 1.1 For any constant $b > 1.5$, there is a combinatorial algorithm that runs in $\tilde{O}(n^b)$ time and provides an approximation factor that is a constant greater than 0.5.

The running time/approximation factor tradeoff curve is shown in Figure 1. A few representative numbers: in $\tilde{O}(n^{1.6})$, $\tilde{O}(n^2)$, and $\tilde{O}(n^3)$ times, we can get approximation factors of 0.5051, 0.5155, and 0.5727 respectively. As $b$ becomes large, this converges to the ratio of Trevisan's algorithm.

1 This was initially only proved for graphs in which the ratio of maximum to average degree was bounded by a polylogarithmic factor, but a linear-time reduction due to Trevisan [Tre09] converts any arbitrary graph to this case.
2 In this paper, we use the $\tilde{O}$ notation to suppress dependence on polylogarithmic factors.
2. Even though the core of our algorithm is completely combinatorial, relying only on simple random walks and integer operations, the analysis of the algorithm is based on spectral methods. We obtain a combinatorial version of Trevisan's algorithm by showing two key facts: (a) the "flipping signs" random walks we use correspond to running the power method on the graph Laplacian, and (b) a random starting vertex yields a good starting vector for the power method with constant probability. These two facts replace numerical matrix computations with the combinatorial problem of estimating certain probabilities, which can be done effectively by sampling and concentration bounds. This also allows improved running times since we can selectively find portions of the graph and classify them.

3. A direct application of the partitioning procedure yields an algorithm whose running time is $\tilde{O}(n^{2+\mu})$. To design the sub-quadratic time algorithm, we have to ensure that the random walks in the algorithm mix rapidly. To do this, we design a local graph partitioning algorithm of independent interest based on simple random walks of logarithmic length. Given a starting vertex $i$, either it finds a low conductance cut or certifies that the random walk from $i$ has somewhat mixed, in the sense that the ratio of the probability of hitting any vertex $j$ to its probability in the stationary distribution is bounded. The work done per vertex output in the cut is sublinear in $n$. The precise statement is given in Theorem 4.1. Previous local partitioning algorithms [ST04, ACL06, AL08] are more efficient than our procedure, but can only output a low conductance cut if the actual conductance of some set containing $i$ is $O(1/\log n)$. In this paper, we need to be able to find low conductance cuts in more general settings, even if there is no cut of conductance $O(1/\log n)$, and hence the previous algorithms are unsuitable for our purposes.



1.2 Related work 

Trevisan [Tre05] also uses random walks to give approximation algorithms for MaxCut (as a special case of unique games), although the algorithm only deals with the case when MaxCut is $1 - O(1/\mathrm{poly}(\log n))$. The property tester for bipartiteness in sparse graphs by Goldreich and Ron [GR99] is a sublinear time procedure that uses random walks to distinguish graphs where MaxCut $= 1$ from MaxCut $\le 1 - \varepsilon$. The algorithm, however, does not actually give an approximation to MaxCut. There is a similarity in flavor to Dinur's proof of the PCP theorem [Din06], which uses random walks and majority votes for gap amplification of CSPs. Our algorithm might be seen as some kind of belief propagation, where messages about labels are passed around.

For the special case of cubic and maximum degree 3 graphs, there has been a study of combinatorial algorithms for MaxCut [BL86, HLZ04, BT08]. These are based on graph theoretic properties and are very different from our algorithms. Combinatorial algorithms for CSPs (constraint satisfaction problems) based on LP relaxations have been studied in [DFG+03].



2 Algorithm Overview and Intuition 

Let us revisit the greedy algorithm. We currently have a partial cut, where some subset $S$ of the vertices have been classified (placed in either side of the cut). We take a new vertex $i \notin S$ and look at the edges of $i$ incident to $S$. In some sense, each such edge provides a "vote" telling $i$ where to go. Suppose there is such an edge $(i, j)$, such that $j \in R$. Since we want to cut edges, this edge tells $i$ to be placed in $L$. We place $i$ according to a majority vote, and hence the 0.5 factor.
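The voting view of the greedy algorithm can be sketched in a few lines. This is only an illustration under our own assumptions: the function names `greedy_maxcut` and `cut_weight`, the vertex order 0, 1, ..., n-1, and the tie-breaking rule are not taken from the paper.

```python
def greedy_maxcut(n, edges):
    """Greedy cut: each new vertex joins the side favoured by a weighted vote.

    n     -- number of vertices, labelled 0..n-1 (hypothetical convention)
    edges -- list of (i, j, weight) triples of an undirected graph
    """
    adj = [[] for _ in range(n)]
    for i, j, w in edges:
        adj[i].append((j, w))
        adj[j].append((i, w))

    side = {}  # classified vertices: vertex -> "L" or "R"
    for i in range(n):
        # An edge to a neighbour already in R votes for placing i in L,
        # and vice versa; unclassified neighbours do not vote.
        votes_for_L = sum(w for j, w in adj[i] if side.get(j) == "R")
        votes_for_R = sum(w for j, w in adj[i] if side.get(j) == "L")
        side[i] = "L" if votes_for_L >= votes_for_R else "R"
    return side


def cut_weight(edges, side):
    """Total weight of edges whose endpoints lie on different sides."""
    return sum(w for i, j, w in edges if side[i] != side[j])
```

Since every vertex cuts at least half of the weight to its already classified neighbours, the resulting cut has at least half of the total edge weight.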

Can we take that idea further, and improve on the 0.5 factor? Suppose we fix a source vertex $i$ and try to classify vertices with respect to the source. Instead of just looking at edges (or paths of length 1), let us look at longer paths. Suppose we choose a length $\ell$ from some nice distribution (say, a binomial distribution with a small expectation) and consider paths of length $\ell$ from $i$. If there are many more even length paths to $j$ than odd length paths, we put $j$ in $L$, otherwise in $R$. This gives a partition of vertices that we can reach, and suggests an algorithm based on random walks. We hope to estimate the odd versus even length probabilities through random walks from $i$. This is a very natural idea and elegantly extends the greedy approach. Rather surprisingly, we show that this can be used to beat the 0.5 factor by a constant.

One of the main challenges is to show that we do not need too many walks to distinguish these 
various probabilities. We also need to choose our length carefully. If it is too long, then the odd 
and even path probabilities may become too close to each other. If it is too short, then it may not 
be enough to get sufficient information to beat the greedy approach. 

Suppose the algorithm detects that the probability of going from vertices i to j by an odd 
length path is significantly higher than an even length path. That suggests that we can be fairly 
confident that i and j should be on different sides of the cut. This constitutes the core of our 
algorithm, Threshold. This algorithm classifies some vertices as lying on "odd" or "even" sides 
of the cut based on which probability (odd or even length paths) is significantly higher than the 
other. Significance is decided by a threshold that is a parameter to the algorithm. We show a 
connection between this algorithm and Trevisan's, and then we adapt his (and Soto's) analysis to 
show that one can choose the threshold carefully so that the amount of work done per classified vertex
is bounded, and the number of uncut edges is small. The search for the right threshold is done by
the Find-threshold algorithm. 

Now, this procedure leaves some vertices unclassified, because no probability is significantly larger than the other. We can simply recurse on the unclassified vertices, as long as the cut we obtain is better than the trivial 0.5-approximate cut. This constitutes the Simple algorithm. The analysis of this algorithm shows that the work done per vertex is at most $\tilde{O}(n^{1+\mu})$ for any constant $\mu > 0$, and thus the overall running time becomes $\tilde{O}(n^{2+\mu})$. This almost matches the running time of Trevisan's algorithm, which runs in $\tilde{O}(n^2)$ time.

To obtain a sub-quadratic running time, we need to do a more careful analysis of the random 
walks involved. If the random walks do not mix rapidly, or, in other words, tend to remain within 
a small portion of the graph, then we end up classifying only a small number of vertices, even if we 
run a large number of these random walks. This is why we get the $\tilde{O}(n^{1+\mu})$ work per vertex ratio.

But in this case, we can exploit the connection between fast mixing and high conductance [Sin92, 
Mih89, LS90] to conclude that there must be a low conductance cut which accounts for the slow 
mixing rate. To make this algorithmic, we design a local graph partitioning algorithm based on 
the same random walks as earlier. This algorithm, CutOrBound, finds a cut of (low) constant conductance if the walks do not mix, and takes only around $\tilde{O}(n^{0.5+\mu})$ time, for any constant $\mu > 0$, per vertex found in the cut. Now, we can remove this low conductance set, and run Simple on the induced subgraph. In the remaining piece, we recurse. Finally, we combine the cuts found randomly. This may leave up to half of the edges in the low conductance cut uncut, but that is only a small constant fraction of the total number of edges overall. This constitutes the Balance algorithm. We show that we spend only $\tilde{O}(n^{0.5+\mu})$ time for every classified vertex, which leads to an $\tilde{O}(n^{1.5+\mu})$ overall running time.

All of these algorithms are combinatorial: they only need random selection of outgoing edges, 
simple arithmetic operations, and comparisons. Although the analysis is technically involved, the 
algorithms themselves are simple and easily implementable. 

3 The Threshold Cut 

We now describe our core random walk based procedure to partition vertices. Some notation first. The graph $G$ will have $n$ vertices. All our algorithms will be based on lazy random walks on $G$ with self-loop probability 1/2. We define these walks now. Fix a length $\ell = O(\log n)$. At each step in the random walk, if we are currently at vertex $j$, then in the next step we stay at $j$ with probability 1/2. With the remaining probability (1/2), we choose a random incident edge $\{j, k\}$ with probability proportional to $w_{jk}$ and move to $k$. Thus the edge $\{j, k\}$ is chosen with overall probability $w_{jk}/(2d_j)$, where $d_j = \sum_{\{j,k\} \in E} w_{jk}$ is the (weighted) degree of vertex $j$. Let $\Delta$ be an upper bound on the maximum degree. By a linear time reduction of Trevisan [Tre01, Tre09], it suffices to solve MaxCut on graphs³ where $\Delta = \mathrm{poly}(\log n)$. We set $m$ to be the sum of weighted degrees, so $m := \sum_j d_j$. We note that by Trevisan's reduction, $m = \tilde{O}(n)$, and thus running times stated in terms of $m$ translate directly to the same polynomial in $n$.

The random walk described above is equivalent to flipping an unbiased coin $\ell$ times, and running a simple (non-lazy) random walk for $h$ steps, where $h$ is the number of heads seen. At each step of this simple random walk, an outgoing edge is chosen with probability proportional to its weight. We call $h$ the hop-length of the random walk, and we call a walk odd or even based on the parity of $h$.
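The coin-flipping description translates directly into a sampler. The sketch below is illustrative only: the adjacency format (a dict mapping each vertex to a list of `(neighbour, weight)` pairs) and the name `lazy_walk` are our own assumptions.

```python
import random


def lazy_walk(adj, start, length, rng):
    """Sample one lazy walk and return (end vertex, hop-length).

    adj    -- dict: vertex -> list of (neighbour, weight) pairs (assumed format)
    length -- number of lazy steps, i.e. number of fair coin flips
    rng    -- a random.Random instance
    """
    # The hop-length is the number of heads among `length` fair coin flips.
    hops = sum(1 for _ in range(length) if rng.random() < 0.5)
    v = start
    for _ in range(hops):
        # Simple (non-lazy) step: pick an incident edge proportionally to weight.
        nbrs, weights = zip(*adj[v])
        v = rng.choices(nbrs, weights=weights, k=1)[0]
    return v, hops
```

On a bipartite graph the parity of the hop-length determines the side on which the walk ends, which is the signal the partitioning procedure tries to pick up on general graphs.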

We will denote the two sides of the cut by $L$ and $R$. The parameters $\varepsilon$ and $\mu$ are fixed throughout this section, and should be considered as constants. We will choose the length $\ell$ of the walk to be $\mu(\ln(4m/\delta^2))/[2(\delta + \varepsilon)]$ (the reason for this choice will be explained later). We will assume that $\gamma$ and $\delta$ are arbitrarily small constants. The procedure Threshold takes as input a threshold $t$, and puts some vertices in one of two sets, Even and Odd, that are assumed to be global variables (i.e. different calls to Threshold update the same sets). We call vertices $j \in$ Even $\cup$ Odd classified. Once classified, a vertex is never re-classified. We perform a series of random walks to decide this. The number of walks will be a function $w(t)$ of this threshold. We will specify this function later.



Threshold Input: Graph $G = (V, E)$. Parameters: Starting vertex $i$, threshold $t$.

1. Perform $w(t)$ walks of length $\ell$ from $i$.

2. For every vertex $j$ that is not classified:

(a) Let $\bar{y}_i(j) := \frac{1}{d_j w(t)}\,(\#\{\text{even walks ending at } j\} - \#\{\text{odd walks ending at } j\})$.

(b) If $\bar{y}_i(j) > t$, put $j$ in set Even. If $\bar{y}_i(j) < -t$, put it in set Odd.
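The procedure can be sketched as follows. This is a minimal illustration, not the authors' implementation: the data layout (`adj`, `deg`), the explicit `num_walks` and `length` arguments standing in for $w(t)$ and $\ell$, and the normalization of the signed count by $d_j$ times the number of walks are assumptions made for this example.

```python
import random
from collections import defaultdict


def threshold(adj, deg, start, t, num_walks, length, even, odd, rng):
    """One Threshold-style call; updates and returns the sets `even` and `odd`.

    adj -- dict: vertex -> list of (neighbour, weight) pairs (assumed format)
    deg -- dict: vertex -> weighted degree
    """
    score = defaultdict(int)  # (#even walks - #odd walks) per end vertex
    for _ in range(num_walks):
        v, hops = start, 0
        for _ in range(length):
            if rng.random() < 0.5:
                continue  # lazy step: stay at the current vertex
            nbrs, weights = zip(*adj[v])
            v = rng.choices(nbrs, weights=weights, k=1)[0]
            hops += 1
        score[v] += 1 if hops % 2 == 0 else -1

    for j, s in score.items():
        if j in even or j in odd:
            continue  # a classified vertex is never re-classified
        estimate = s / (deg[j] * num_walks)  # degree-normalized estimate
        if estimate > t:
            even.add(j)
        elif estimate < -t:
            odd.add(j)
    return even, odd
```

Vertices never reached by any walk have a zero estimate and therefore stay unclassified for any positive threshold.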

We normalize the difference of the number of even and odd walks by $d_j$ to account for differences in degrees. This accounts for the fact that the stationary probability of the random walk at $j$ is proportional to $d_j$. For the same reason, when we say "vertex chosen at random" we will mean choosing a vertex $i$ with probability proportional to $d_i$. We now need some definitions.

Definition 3.1 (Work-to-output ratio.) Let $\mathcal{A}$ be an algorithm that, in time $T$, classifies $k$ vertices (into the sets Even or Odd). Then the work-to-output ratio of $\mathcal{A}$ is defined to be $T/k$.

Definition 3.2 (Good, Cross, Inc, Cut.) Given two sets of vertices $A$ and $B$, let Good($A$, $B$) be the total weight of edges that have one endpoint in $A$ and the other in $B$. Let Cross($A$, $B$) be the total weight of edges with only one endpoint in $A \cup B$. Let Inc($A$, $B$) be the total weight of edges incident on $A \cup B$. We set Cut($A$, $B$) := Good($A$, $B$) + Cross($A$, $B$)/2.

Suppose we either put all the vertices in Even in L or R, and the vertices in Odd in R or L 
respectively, retaining whichever assignment cuts more edges. Then the number of edges cut is at 
least Cut(Even, Odd). 
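The four quantities of Definition 3.2 are straightforward to compute in one pass over the edges. The helper below is a sketch with a hypothetical name (`cut_statistics`) and an assumed edge-list format; it takes $A$ and $B$ to be disjoint sets.

```python
def cut_statistics(edges, A, B):
    """Return Good, Cross, Inc and Cut for disjoint vertex sets A and B.

    edges -- list of (i, j, weight) triples of an undirected graph
    """
    union = A | B
    good = cross = inc = 0.0
    for i, j, w in edges:
        touched = (i in union) + (j in union)  # endpoints inside A ∪ B
        if touched == 0:
            continue
        inc += w                      # edge is incident on A ∪ B
        if touched == 1:
            cross += w                # exactly one endpoint in A ∪ B
        elif (i in A) != (j in A):
            good += w                 # one endpoint in A, the other in B
    return {"good": good, "cross": cross, "inc": inc, "cut": good + cross / 2}
```

For the path 0-1-2-3 with unit weights, $A = \{0\}$ and $B = \{1\}$, the edge (0, 1) is good, the edge (1, 2) is crossing, and the edge (2, 3) is not incident on $A \cup B$.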

3 We can think of these as unweighted multigraphs. 
Definition 3.3 ($\alpha$, $w(t)$, $\sigma$, $f(\sigma)$.)

1. For every vertex $j$, let $p_j$ be the probability of reaching $j$ starting from $i$ with an $\ell$-length lazy random walk. Let $\alpha$ be an upper bound on $\max_j p_j/d_j$.

2. Define $w(t) := \kappa \ln(n) \max(\alpha, t)/t^2$, for a large enough constant $\kappa$.

3. Define $\sigma := 1 - (1-\varepsilon)^{1+1/\mu} - o(1)$, where the $o(1)$ term can be made as small as we please by setting $\delta$, $\gamma$ to be sufficiently small constants.

4. Define the function $f(\sigma)$ (cf. [Sot09]) as follows: here $\sigma_0 = 0.22815\ldots$ is a fixed constant. If $\sigma > 1/3$, then $f(\sigma) = 0.5$. If $\sigma_0 \le \sigma \le 1/3$, then
$$f(\sigma) = \frac{-1 + \sqrt{4\sigma^2 - 8\sigma + 5}}{2(1-\sigma)}.$$
Otherwise,
$$f(\sigma) = \frac{1}{1 + 2\sqrt{\sigma(1-\sigma)}}.$$
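To get a feel for the function $f$, here is a small numerical sketch. It assumes the two closed forms $(-1 + \sqrt{4\sigma^2 - 8\sigma + 5})/(2(1-\sigma))$ on the middle range and $1/(1 + 2\sqrt{\sigma(1-\sigma)})$ below $\sigma_0$, and it truncates $\sigma_0$ to 0.22815; the names `SIGMA_0` and `f` are ours.

```python
import math

SIGMA_0 = 0.22815  # truncated value of the fixed constant sigma_0


def f(sigma):
    """Approximation guarantee as a function of sigma (assumed closed forms)."""
    if sigma > 1 / 3:
        return 0.5
    if sigma >= SIGMA_0:
        return (-1 + math.sqrt(4 * sigma**2 - 8 * sigma + 5)) / (2 * (1 - sigma))
    return 1 / (1 + 2 * math.sqrt(sigma * (1 - sigma)))
```

Under these assumptions the two branches agree at $\sigma_0$ up to the truncation error, the middle branch equals 0.5 at $\sigma = 1/3$, and $f(\sigma)$ approaches 1 as $\sigma$ tends to 0.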

The parameter $\alpha$ measures how far the walk is from mixing, because the stationary probability of $j$ is proportional to $d_j$. The function $f(\sigma) > 0.5$ when $\sigma < 1/3$, and this leads to an approximation factor greater than 0.5. Now we state our main performance bound for Threshold.

Lemma 3.4 Suppose MaxCut $\ge 1 - \varepsilon$. Then, there is a threshold $t$ such that with constant probability over the choice of a starting vertex $i$ chosen at random, the following holds. The procedure Threshold($i$, $t$) outputs sets Even and Odd such that Cut(Even, Odd) $\ge f(\sigma)\,$Inc(Even, Odd). Furthermore, the work-to-output ratio is bounded by $\tilde{O}(\alpha \Delta m^{1+\mu} + 1/\alpha)$.

The main procedure of this section, Find-threshold, is just an algorithmic version of the existential result of Lemma 3.4.



Find-threshold Input: Graph $G = (V, E)$. Parameters: Starting vertex $i$.

1. Initialize sets Even and Odd to empty sets.

2. For $t_r = (1 - \gamma)^r$, for $r = 0, 1, 2, \ldots$, as long as $t_r \ge \gamma/m^{1+\mu/2}$:

(a) Run Threshold($i$, $t_r$).

(b) If Cut(Even, Odd) $\ge f(\sigma)\,$Inc(Even, Odd) and $|\text{Even} \cup \text{Odd}| \ge (\Delta t_r^2 n^{1+\mu} \log n)^{-1}$, output Even and Odd. Otherwise go to the next threshold.

3. Output FAIL.
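The geometric search over thresholds in Step 2 can be sketched as a generator. This only illustrates the schedule $t_r = (1-\gamma)^r$ with the stopping rule $t_r \ge \gamma/m^{1+\mu/2}$; the name `threshold_schedule` is hypothetical and the sketch assumes $0 < \gamma < 1$.

```python
def threshold_schedule(gamma, m, mu):
    """Yield t_r = (1 - gamma)^r for r = 0, 1, 2, ... while t_r >= gamma / m^(1 + mu/2)."""
    floor = gamma / m ** (1 + mu / 2)
    t = 1.0
    while t >= floor:
        yield t
        t *= 1 - gamma  # geometric decrease of the threshold
```

Because the thresholds shrink geometrically, the number of rounds is only logarithmic in $m$ for constant $\gamma$ and $\mu$.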



We are now ready to state the performance bounds for Find-threshold. 

Lemma 3.5 Suppose MaxCut $\ge 1 - \varepsilon$. Let $i$ be chosen at random. With constant probability over the choice of $i$ and the randomness of Find-threshold($i$), the procedure Find-threshold($i$) succeeds and has a work-to-output ratio of $\tilde{O}(\alpha \Delta m^{1+\mu} + 1/\alpha)$. Furthermore, regardless of the value of MaxCut or the choice of $i$, the worst-case running time of Find-threshold($i$) is $\tilde{O}(\alpha \Delta m^{2+\mu})$.



The proofs of Lemmas 3.4 and 3.5 use results from Trevisan's and Soto's analyses [Tre09, Sot09]. The vectors we consider will always be $n$-dimensional, and should be thought of as an assignment of values to each of the $n$ vertices in $G$. Previous analyses rest on the fact that a vector that has a large Rayleigh quotient (with respect to the graph Laplacian⁴) can be used to find good cuts. Call such a vector "good". These analyses show that partitioning vertices by thresholding over a good vector $x$ yields a good cut. This means that for some threshold $t$, vertices $j$ with $x(j) \ge t$ are placed in $L$ and those with $x(j) \le -t$ are placed in $R$. We would like to show that Threshold is essentially performing such a thresholding on some good vector. We will construct a vector, somewhat like a distribution, related to Threshold, and show that it is good. This requires an involved spectral analysis, which is formalized in Lemma 3.7. With this in place, we use concentration inequalities and an adaptation of the techniques in [Sot09] to connect thresholding to the cuts looked at by Find-threshold. We first state Lemma 3.7. Then we will show how to prove Lemmas 3.4 and 3.5 using Lemma 3.7. This is rather involved, but intuitively should be fairly clear. It mainly requires understanding of the random process that Threshold uses to classify vertices.

4 For a vector $x$ and matrix $M$, the Rayleigh quotient is $x^\top M x / x^\top x$.

We need some definitions. Let $A$ be the (weighted) adjacency matrix of $G$ and $d_i$ be the degree of vertex $i$. The (normalized) Laplacian of the graph is $\mathcal{L} = I - D^{-1/2} A D^{-1/2}$. Here $D$ is the matrix where $D_{ii} = d_i$ and $D_{ij} = 0$ (for $i \ne j$). For a vector $x$ and coordinate/vertex $j$, we use $x(j)$ to denote the $j$th coordinate of $x$ (we do not use subscripts for coordinates of vectors). In [Tre09] and [Sot09], it was shown that vectors that have high Rayleigh quotients with $\mathcal{L}$ can be used to get a partition that cuts a significant number of edges. Given a vector $y$, let us do a simple rounding to partition the vertices. We define the sets $P(y, t) = \{j \mid y(j) \ge t\}$ and $N(y, t) = \{j \mid y(j) \le -t\}$. We refer to roundings of this form as tripartitions, since we divide the vertices into three sets. The following lemma, which is Lemma 4.2 from [Sot09], an improvement of the analysis in [Tre09], shows that this tripartition cuts many edges for some threshold:

Lemma 3.6 ([Sot09]) Suppose $x^\top \mathcal{L} x \ge 2(1 - \sigma)\|x\|^2$. Let $y = D^{-1/2} x$. Then, for some $t$ (called good), Cut($P(y,t)$, $N(y,t)$) $\ge f(\sigma)\,$Inc($P(y,t)$, $N(y,t)$).

The algorithm of Trevisan is the following: compute the top eigenvector $x$ of $\mathcal{L}$ (approximately), compute $y = D^{-1/2} x$, and find a good threshold $t$ and the corresponding sets $P(y, t)$, $N(y, t)$. Assign $P(y, t)$ and $N(y, t)$ to $L$ and $R$ (or vice-versa, depending on which assignment cuts more edges), and recurse on the remaining unclassified vertices.
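The tripartition rounding can be illustrated by sweeping over candidate thresholds and keeping the one with the best Cut/Inc ratio. This is a sketch under our own assumptions (the function name `best_tripartition`, the edge-list format, and the choice to try only the distinct nonzero values of $|y(j)|$ as thresholds); it is not the search procedure analyzed in the paper.

```python
def best_tripartition(edges, y):
    """Try each threshold t in {|y(j)| : y(j) != 0} and return the best
    (ratio, t, P, N), where ratio = Cut(P, N) / Inc(P, N).

    edges -- list of (i, j, weight) triples; y -- list of vertex values
    """
    best = None
    for t in sorted({abs(v) for v in y if v != 0}):
        P = {j for j, v in enumerate(y) if v >= t}
        N = {j for j, v in enumerate(y) if v <= -t}
        union = P | N
        good = cross = inc = 0.0
        for i, j, w in edges:
            touched = (i in union) + (j in union)
            if touched == 0:
                continue
            inc += w
            if touched == 1:
                cross += w
            elif (i in P) != (j in P):
                good += w  # one endpoint in P, the other in N
        if inc == 0:
            continue
        ratio = (good + cross / 2) / inc
        if best is None or ratio > best[0]:
            best = (ratio, t, P, N)
    return best
```

Vertices with $|y(j)| < t$ fall into the third, unclassified part and are left for the recursion.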

The algorithms of this paper essentially mimic this process, except that instead of computing the top eigenvector, we use random walks. We establish a connection between random walks and the power method to compute the top eigenvector. Let $p^h_{ij}$ be the probability that a length $\ell$ (remember that this is fixed) lazy random walk from $i$ reaches $j$ with hop-length $h$. Then define the vector $q_i$ as follows: the $j$th coordinate of $q_i$ is
$$q_i(j) := \frac{1}{\sqrt{d_j}}\left(\sum_{h \text{ even}} p^h_{ij} - \sum_{h \text{ odd}} p^h_{ij}\right) = \frac{1}{\sqrt{d_j}} \sum_{h=0}^{\ell} (-1)^h p^h_{ij}.$$

Note that Threshold is essentially computing an estimate $\bar{y}_i(j)$ of $q_i(j)/\sqrt{d_j}$. For convenience, we will denote $D^{-1/2} q_i$ by $y_i$. This is the main lemma of this section.

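Lemma 3.7 Let $\delta > 0$ be a sufficiently small constant, and $\mu > 0$ be a (constant) parameter. If $\ell = \mu(\ln(4m/\delta^2))/[2(\delta + \varepsilon')]$, where $\varepsilon' = -\ln(1 - \varepsilon)$, then with constant probability over the choice of $i$, $\|q_i\|^2 = \Omega(1/m^{1+\mu})$, and

$$q_i^\top \mathcal{L} q_i \ge 2e^{-(2+1/\mu)\delta}(1-\varepsilon)^{1+1/\mu}\,\|q_i\|^2. \qquad (1)$$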



Although this is not at all straightforward, Lemma 3.7 together with Lemma 3.6 essentially proves Lemma 3.4. To ease the flow of the paper, we defer these arguments to Section 3.1.

Lemma 3.7 is proved in two parts. In the first, we establish a connection between the random walks we perform and running the power method on the Laplacian:

Claim 3.8 Let $e_i$ be the $i$th standard basis vector. Then, we have $q_i = \frac{1}{2^\ell}\mathcal{L}^\ell\left(\frac{1}{\sqrt{d_i}} e_i\right)$.
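Proof: Note that $\mathcal{L}^\ell = (I - D^{-1/2} A D^{-1/2})^\ell = (D^{-1/2}(I - AD^{-1})D^{1/2})^\ell = D^{-1/2}(I - AD^{-1})^\ell D^{1/2}$. Hence,

$$\frac{1}{2^\ell}\mathcal{L}^\ell\left(\frac{1}{\sqrt{d_i}} e_i\right) = D^{-1/2}\left(\frac{I - AD^{-1}}{2}\right)^{\ell} e_i = D^{-1/2} \sum_{h=0}^{\ell} (-1)^h \binom{\ell}{h}\left(\tfrac{1}{2}\right)^{\ell-h}\left(\tfrac{1}{2}AD^{-1}\right)^{h} e_i = q_i.$$

The last equality follows because the vector $\binom{\ell}{h}\left(\tfrac{1}{2}\right)^{\ell-h}\left(\tfrac{1}{2}AD^{-1}\right)^{h} e_i$ is the vector of probabilities of reaching different vertices starting from $i$ in a walk of length $\ell$ with hop-length exactly $h$. We also used the facts that $D^{1/2}\frac{1}{\sqrt{d_i}} e_i = e_i$ and $D^{-1/2} e_j = \frac{1}{\sqrt{d_j}} e_j$. □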
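The identity behind Claim 3.8 can be checked exactly on a small graph with rational arithmetic. The sketch below compares the hop-length decomposition of the signed walk with repeated application of $(I - AD^{-1})/2$, both scaled by $D^{-1}$ (that is, it compares the two expressions for $y_i = D^{-1/2} q_i$, which avoids square roots). The function names and the dense adjacency-matrix input are assumptions made for this illustration.

```python
from fractions import Fraction
from math import comb


def _walk_step(adj_matrix, deg, vec):
    """Apply A D^{-1}: one step of the simple (non-lazy) walk."""
    n = len(adj_matrix)
    return [sum(Fraction(adj_matrix[j][k]) * vec[k] / deg[k] for k in range(n))
            for j in range(n)]


def signed_walk_vector(adj_matrix, i, length):
    """y_i(j) = (1/d_j) * sum_h (-1)^h p^h_ij, summing over hop-lengths h."""
    n = len(adj_matrix)
    deg = [sum(row) for row in adj_matrix]
    dist = [Fraction(int(j == i)) for j in range(n)]  # (A D^{-1})^h e_i
    total = [Fraction(0)] * n
    for h in range(length + 1):
        # Signed probability of seeing exactly h heads in `length` fair flips.
        coeff = (-1) ** h * Fraction(comb(length, h), 2 ** length)
        total = [acc + coeff * p for acc, p in zip(total, dist)]
        dist = _walk_step(adj_matrix, deg, dist)
    return [total[j] / deg[j] for j in range(n)]


def power_method_vector(adj_matrix, i, length):
    """D^{-1} ((I - A D^{-1}) / 2)^length e_i, computed by repeated application."""
    n = len(adj_matrix)
    deg = [sum(row) for row in adj_matrix]
    vec = [Fraction(int(j == i)) for j in range(n)]
    for _ in range(length):
        moved = _walk_step(adj_matrix, deg, vec)
        vec = [(stay - move) / 2 for stay, move in zip(vec, moved)]
    return [vec[j] / deg[j] for j in range(n)]
```

The two functions agree exactly because $I$ and $AD^{-1}$ commute, so the binomial theorem applies to $((I - AD^{-1})/2)^\ell$.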



In the second part, we show that with constant probability, a randomly chosen starting vertex yields a good starting vector for the power method, i.e., the vector $q_i$ satisfies (1). This will require a spectral analysis. We need some notation first. Let the eigenvalues of $\mathcal{L}$ be $2 \ge \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n = 0$, and let the corresponding (unit) eigenvectors be $v_1, v_2, \ldots, v_n$, where $v_n$ is in the direction of $D^{1/2}\mathbf{1}_V$. For a subset $S$ of vertices, define $\mathrm{Vol}(S) = \sum_{i \in S} d_i$. Let $H = \{k : \lambda_k \ge 2e^{-\delta}(1 - \varepsilon)\}$. Any vector $x$ can be expressed in terms of the eigenvectors of $\mathcal{L}$ as $x = \sum_k \alpha_k v_k$. Define the norm $\|x\|_H = \sqrt{\sum_{k \in H} \alpha_k^2}$.

Let $(S, \bar{S})$ be the max-cut, where we use the convention $\bar{S} = V \setminus S$. Let $\mathrm{Vol}(S) \le \mathrm{Vol}(V)/2$ and define $s := \mathrm{Vol}(S)$. Note that $m = \mathrm{Vol}(V)$. Since the max-cut has size at least $(1 - \varepsilon)m/2$, we must have $s \ge (1 - \varepsilon)m/2$. We set the vector $x = D^{1/2} y$ where $y = \frac{1}{s}\mathbf{1}_S - \frac{1}{m}\mathbf{1}_V$, where $\mathbf{1}_S$ is the indicator vector for $S$. We will need some preliminary claims before we can show (1).

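Claim 3.9 $\|x\|_H^2 \ge \delta/m$.

Proof: We have

$$x^\top \mathcal{L} x = \sum_{\{i,j\} \in E} w_{ij}\,(y(i) - y(j))^2 = E(S, \bar{S}) \cdot \frac{1}{s^2} \ge \frac{(1 - \varepsilon)\cdot(m/2)}{s^2} = \frac{(1 - \varepsilon)m}{2s^2}.$$

Now⁵ $\|x\|^2 = \frac{1}{s} - \frac{1}{m}$. Let $x = \sum_k \alpha_k v_k$ be the representation of $x$ in the basis given by the $v_k$'s, and let $a := \|x\|_H^2$. Then we have $\|x\|^2 = \sum_k \alpha_k^2$, and

$$x^\top \mathcal{L} x = \sum_k \lambda_k \alpha_k^2 \le 2\sum_{k \in H} \alpha_k^2 + 2e^{-\delta}(1 - \varepsilon)\sum_{k \notin H} \alpha_k^2 = 2a + 2e^{-\delta}(1 - \varepsilon)\left(\frac{1}{s} - \frac{1}{m} - a\right).$$

Combining the two bounds, and solving for $a$, we get the required bound for small enough $\delta$. □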
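Claim 3.10 With constant probability over the choice of $i$, $\|e_i\|_H^2 \ge \delta d_i/4m$.

Proof: Let $T := \{i \in S : \|\frac{1}{\sqrt{d_i}} e_i\|_H^2 < \frac{\delta}{4m}\}$, and let $t = \mathrm{Vol}(T)$. Our aim is to show that $t$ is at most a constant fraction of $s$. For the sake of contradiction, assume $t > (1 - \theta)s$, where $\theta = \delta(1 - \varepsilon)/16$. Let $z = D^{1/2}\left(\frac{1}{t}\mathbf{1}_T - \frac{1}{m}\mathbf{1}_V\right)$. We have

$$\|x - z\|_H^2 \le \|x - z\|^2 = \frac{1}{t} - \frac{1}{s} \le \frac{2\theta}{s} \le \frac{4\theta}{(1 - \varepsilon)m} = \frac{\delta}{4m}.$$

The second inequality above uses the fact that $t > (1 - \theta)s$ and $\theta < 1/2$. The third inequality follows from $s \ge (1 - \varepsilon)m/2$. By the triangle inequality and Claim 3.9, we have

$$\|z\|_H \ge \|x\|_H - \|x - z\|_H \ge \sqrt{\frac{\delta}{m}} - \sqrt{\frac{\delta}{4m}} = \sqrt{\frac{\delta}{4m}}.$$

Now, we have $z = \sum_{i \in T} \frac{d_i}{t}\, D^{1/2}\left(\frac{1}{d_i} e_i - \frac{1}{m}\mathbf{1}_V\right)$, so by Jensen's inequality, we get

$$\frac{\delta}{4m} \le \|z\|_H^2 \le \sum_{i \in T} \frac{d_i}{t} \left\| D^{1/2}\left(\frac{1}{d_i} e_i - \frac{1}{m}\mathbf{1}_V\right) \right\|_H^2 = \sum_{i \in T} \frac{d_i}{t} \left\| D^{1/2}\,\frac{1}{d_i} e_i \right\|_H^2 < \frac{\delta}{4m},$$

a contradiction. The equality in the chain above holds because $D^{1/2}\mathbf{1}_V$ has no component along the eigenvectors corresponding to $H$ (this is an eigenvector itself, with eigenvalue 0).

Thus the set $S \setminus T$ has volume at least $\theta s \ge \delta(1 - \varepsilon)^2 m/32$. Note that the sampling process, which chooses the initial vertex of the random walk by choosing a random edge and choosing a random end-point $i$ of it, hits some vertex in $S \setminus T$ with probability at least $\mathrm{Vol}(S \setminus T)/\mathrm{Vol}(V) \ge \delta(1 - \varepsilon)^2/32$, i.e. constant probability. □

5 This is easily seen using Pythagoras: since $D^{1/2}\left(\frac{1}{s}\mathbf{1}_S - \frac{1}{m}\mathbf{1}_V\right) \cdot D^{1/2}\mathbf{1}_V = 0$. This only uses the fact that $S \subseteq V$.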
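The sampling process mentioned in the proof (pick a random edge, then a random endpoint) selects each vertex with probability proportional to its weighted degree. A minimal sketch, with a hypothetical function name and an assumed edge-list format:

```python
import random


def sample_start_vertex(edges, rng):
    """Pick an edge with probability proportional to its weight, then return
    one of its two endpoints uniformly at random.

    edges -- list of (i, j, weight) triples; rng -- a random.Random instance
    """
    weights = [w for _, _, w in edges]
    i, j, _ = rng.choices(edges, weights=weights, k=1)[0]
    return i if rng.random() < 0.5 else j
```

Vertex $i$ is returned with probability $\sum_{j} w_{ij}/(2W)$, where $W$ is the total edge weight, which equals $d_i/m$ in the notation of this section.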



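At this point, standard calculations for the power method imply Lemma 3.7.

Proof: (Of Lemma 3.7) From Claim 3.10, with constant probability $\|e_i\|_H^2 \ge \delta d_i/4m$. Let us assume this is the case.

For convenience, define $\beta = (\delta + \varepsilon')/\mu$, so that the length of the walks is $\ell = \frac{\ln(4m/\delta^2)}{2\beta}$. Now let $H' = \{k : \lambda_k \ge 2e^{-(\delta+\beta)}(1 - \varepsilon)\}$. Write $\frac{1}{\sqrt{d_i}} e_i$ in terms of the $v_k$'s as $\frac{1}{\sqrt{d_i}} e_i = \sum_k \alpha_k v_k$. Let $\tilde{y}_i = \mathcal{L}^\ell \frac{1}{\sqrt{d_i}} e_i = \sum_k \alpha_k \lambda_k^\ell v_k$. Note that $q_i = \frac{1}{2^\ell}\tilde{y}_i$. Then we have

$$\tilde{y}_i^\top \mathcal{L}\, \tilde{y}_i = \sum_k \alpha_k^2 \lambda_k^{2\ell+1} \ge \sum_{k \in H'} \alpha_k^2 \lambda_k^{2\ell+1} \ge 2e^{-(\delta+\beta)}(1 - \varepsilon) \sum_{k \in H'} \alpha_k^2 \lambda_k^{2\ell},$$

and

$$\|\tilde{y}_i\|^2 = \sum_k \alpha_k^2 \lambda_k^{2\ell} = \sum_{k \in H'} \alpha_k^2 \lambda_k^{2\ell} \left(1 + \frac{\sum_{k \notin H'} \alpha_k^2 \lambda_k^{2\ell}}{\sum_{k \in H'} \alpha_k^2 \lambda_k^{2\ell}}\right).$$

Since $H \subseteq H'$, $\sum_k \alpha_k^2 = 1/d_i \le 1$, and $\sum_{k \in H} \alpha_k^2 = \|\frac{1}{\sqrt{d_i}} e_i\|_H^2 \ge \delta/4m$, we have

$$\frac{\sum_{k \notin H'} \alpha_k^2 \lambda_k^{2\ell}}{\sum_{k \in H'} \alpha_k^2 \lambda_k^{2\ell}} \le \frac{\sum_{k \notin H'} \alpha_k^2 \left(2e^{-(\delta+\beta)}(1 - \varepsilon)\right)^{2\ell}}{\sum_{k \in H} \alpha_k^2 \left(2e^{-\delta}(1 - \varepsilon)\right)^{2\ell}} \le \frac{4m\,e^{-2\beta\ell}}{\delta}.$$

Thus,

$$\frac{\tilde{y}_i^\top \mathcal{L}\, \tilde{y}_i}{\|\tilde{y}_i\|^2} \ge \frac{2e^{-(\delta+\beta)}(1 - \varepsilon)}{1 + \frac{4m\,e^{-2\beta\ell}}{\delta}}.$$

Observe that $\tilde{y}_i$ is just a scaled version of $q_i$, so we can replace $\tilde{y}_i$ by $q_i$ above. For the denominator on the right, we would like it to be at most $e^{\delta}$. Choosing $\ell = \frac{\ln(4m/\delta^2)}{2\beta}$, the denominator becomes $1 + \delta \le e^{\delta}$, and we get

$$q_i^\top \mathcal{L} q_i \ge 2e^{-(2\delta+\beta)}(1 - \varepsilon)\,\|q_i\|^2 = 2e^{-(2+1/\mu)\delta}(1 - \varepsilon)^{1+1/\mu}\,\|q_i\|^2.$$

Since $\|e_i\|_H^2 \ge \delta d_i/4m$, we have

$$\sum_{k \in H} \alpha_k^2 = \left\|\frac{1}{\sqrt{d_i}} e_i\right\|_H^2 \ge \frac{\delta}{4m}.$$

This implies

$$\|q_i\|^2 = \frac{1}{2^{2\ell}}\|\tilde{y}_i\|^2 = \frac{1}{2^{2\ell}} \sum_k \alpha_k^2 \lambda_k^{2\ell} \ge \frac{1}{2^{2\ell}} \sum_{k \in H} \alpha_k^2 \lambda_k^{2\ell}.$$

By definition, for all $k \in H$, $\lambda_k \ge 2e^{-\delta}(1 - \varepsilon)$. This gives a lower bound on the rate of decay of these coefficients, as the walk progresses:

$$\frac{1}{2^{2\ell}} \sum_{k \in H} \alpha_k^2 \lambda_k^{2\ell} \ge \frac{1}{2^{2\ell}} \sum_{k \in H} \alpha_k^2 \left(2e^{-\delta}(1 - \varepsilon)\right)^{2\ell} \ge \frac{\delta}{4m}\, e^{-2\delta\ell}(1 - \varepsilon)^{2\ell} = \Omega\left(\frac{1}{m^{1+\mu}}\right),$$

by our choice of $\ell = \frac{\ln(4m/\delta^2)}{2\beta}$. □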
3.1 Proofs of Lemmas 3.4 and 3.5



Both Lemma 3.5 and Lemma 3.4 follow directly from the following statement. 



Lemma 3.11 Let $w(t) = (c' \ln^2 n/\gamma^2)(\alpha/t^2)$, where $c'$ is a sufficiently large constant. Let $C_r$ denote the set of vertices classified by Threshold($i$, $t_r$). The following hold with constant probability over the choice of $i$ and the randomness of Threshold. There exists a threshold $t_r = (1 - \gamma)^r$ such that $\sum_{j \in C_r} d_j = \Omega((t_r^2 m^{1+\mu} \log n)^{-1})$. Also, the tripartition generated satisfies Step 2(b) of Find-threshold.



In this section, we will prove this lemma. But first, we show how this implies Lemmas 3.4 and 3.5.



Proof: (of Lemma 3.4) We take the threshold $t_r$ given by Lemma 3.11. Since it satisfies Step 2(b) of Find-threshold, Cut(Even, Odd) $\ge f(\sigma)\,$Inc(Even, Odd). To see the work-to-output ratio, observe that the work done is $\tilde{O}(w(t_r)) = \tilde{O}(\max(\alpha, t_r)/t_r^2)$. It is convenient to write this as $\tilde{O}(\alpha/t_r^2 + 1/\alpha)$. The output is $C_r$. We have

$$\Delta |C_r| \ge \sum_{j \in C_r} d_j = \Omega\left(\frac{1}{t_r^2\, m^{1+\mu} \log n}\right).$$

The output is at least 1. Therefore, the work per output is at most $\tilde{O}(\alpha \Delta m^{1+\mu} + 1/\alpha)$. □



Proof: (of Lemma 3.5) The running time when there is failure is easy to see. The running time up to round $r$ is $\tilde{O}\left(\sum_{j \le r} \max(\alpha, t_j)/t_j^2\right) = \tilde{O}(\alpha/t_r^2 + 1/\alpha)$. Since $t_r = 1/n^{1+\mu/2}$ and $\alpha \le 1/n^2$, we get the desired bound. By Lemma 3.11, we know that Find-threshold succeeds with high probability. We have some round $r$ where Find-threshold will terminate (satisfying the conditions of Step 2(b)). The work-to-output ratio analysis is the same as the previous proof, and is at most $\tilde{O}(\alpha \Delta m^{1+\mu} + 1/\alpha)$. □



We will first need some auxiliary claims that will help us prove Lemma 3.11. The first step is to use concentration inequalities to bound the number of walks required to get coordinates of $y_i$. As mentioned before, we designate the coordinates of $q_i$ by $q_i(j)$. The vector $p_i$ is the probability vector of the random walk (without charges) for $\ell$ steps. In other words:

$$y_i(j) := (\Pr[\text{Walk from } i \text{ reaches } j \text{ in even path}] - \Pr[\text{Walk from } i \text{ reaches } j \text{ in odd path}])/d_j$$

and

$$p_i(j) := \Pr[\text{Walk from } i \text{ reaches } j \text{ in even path}] + \Pr[\text{Walk from } i \text{ reaches } j \text{ in odd path}].$$

This clearly shows that the random walks performed by Threshold are being used to estimate coordinates of $q_i$. The following claim shows how many walks $w$ are required to get a good approximation of the coordinates of $y_i$.

Claim 3.12 Suppose $w$ walks are performed. Let $c$ be a sufficiently large constant and $1/\ln n < \gamma < 1$. The following hold with probability at least $1 - n^{-4}$.

• If $w \ge (c \ln n/\gamma^2)(\max(\alpha, t)/t^2)$, then we can get an estimate $\bar{y}_i(j)$ such that $\sqrt{d_j}\,|\bar{y}_i(j) - y_i(j)| \le \gamma t$.

• If $w \ge (c \ln n/\gamma^2)\, m^{1+\mu}$, then we can get an estimate $\bar{y}_i(j)$ such that $\sqrt{d_j}\,|\bar{y}_i(j) - y_i(j)| \le \beta_j$, where $\beta_j := \gamma \sqrt{\frac{\max\{p_j,\, 1/m^{1+\mu}\}}{m^{1+\mu}}}$.

Proof: We define a vector of random variables X k , one for each walk. Define random variables 
X k (j) as follows: 

1 walk k ends at j with even hops 
Xk(j) = < — 1 walk k ends at j with odd hops 
walk k doesn't end at j 

Note that E[X k (j)] = yi(j)dj, and Var[X k (j)] = pj. Our estimate y~i(j) will be ~"£2 k X k (j). 
Observing that < 1, Bernstein's inequality implies that for any j3 > 0, 



Pr 



1 w 



' J k=l 



< 2 exp 



3w(3 2 d 2 



6 Pi (j) + 2f3dj 



For the first part, we set $\beta = \gamma t/\sqrt{d_j}$. For a sufficiently large $c$, we get that the exponent is at least
$4 \ln n$, and hence the probability is at most $1/n^4$. For the second part, we set $\beta = \beta_j/\sqrt{d_j}$. Note
that if pj < then f3j < l/m 1+)1 . So, the exponent is at least 4 Inn, completing the proof. □ 
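
The estimator in the proof above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the graph is unweighted and given as adjacency lists, the walk is lazy with a hypothetical `lazy_prob` parameter, and only actual moves count as hops. The function name and signature are illustrative and are not part of the paper's procedures.

```python
import random

def estimate_signed_values(adj, start, length, num_walks, rng, lazy_prob=0.5):
    """Estimate (Pr[even] - Pr[odd]) / d_j for every vertex j from sampled walks."""
    totals = {v: 0 for v in adj}
    for _ in range(num_walks):
        v, hops = start, 0
        for _ in range(length):
            if rng.random() < lazy_prob:
                continue  # lazy step: stay put, parity unchanged (assumption)
            v = rng.choice(adj[v])
            hops += 1
        # X_k(j) is +1 if walk k ends at j after an even number of hops, -1 if odd.
        totals[v] += 1 if hops % 2 == 0 else -1
    # The estimate is (1 / (w * d_j)) * sum_k X_k(j).
    return {v: totals[v] / (num_walks * len(adj[v])) for v in adj}
```

With `lazy_prob=0.0` on a single edge, every walk of odd length ends at the other endpoint with odd parity, so the estimate there is exactly $-1/d_j$.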



We need to find a vector with a large Rayleigh quotient that can be used in Lemma 3.6. We
already have a candidate vector $q_i$. Although we get a very good approximation of this, note that
the order of vertices in an approximation can be very far from that in $q_i$. Nonetheless, the following claim
allows us to do so.

Claim 3.13 Let $x$ be a vector such that $x^T \mathcal{L} x \geq (2 - \epsilon)\|x\|^2$. Then, if $x'$ is a vector such that
$\|x - x'\| \leq \delta \|x\|$, then $\|x'\|^2 \geq (1 - 3\delta)\|x\|^2$ and $x'^T \mathcal{L} x' \geq (2 - \epsilon - 12\delta)\|x'\|^2$.

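Proof: We have

$$x'^T \mathcal{L} x' - x^T \mathcal{L} x = x'^T \mathcal{L} x' - x'^T \mathcal{L} x + x'^T \mathcal{L} x - x^T \mathcal{L} x = (x' - x)^T \mathcal{L} (x' + x).$$

Thus,

$$|x'^T \mathcal{L} x' - x^T \mathcal{L} x| \leq (\|x'\| + \|x\|) \cdot \|\mathcal{L}\| \cdot \|x - x'\| \leq (2 + \delta)\|x\| \cdot 2 \cdot \delta \|x\| \leq 6\delta \|x\|^2.$$

Furthermore,

$$\big| \|x\|^2 - \|x'\|^2 \big| \leq \|x + x'\| \cdot \|x - x'\| \leq (2 + \delta)\|x\| \cdot \delta \|x\| \leq 3\delta \|x\|^2.$$

Thus, we have

$$x'^T \mathcal{L} x' \geq x^T \mathcal{L} x - 6\delta \|x\|^2 \geq (2 - \epsilon - 6\delta)\|x\|^2 \geq \frac{2 - \epsilon - 6\delta}{1 + 3\delta} \|x'\|^2 \geq (2 - \epsilon - 12\delta)\|x'\|^2.$$

□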
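
As a sanity check of Claim 3.13, the following sketch evaluates both conclusions on a 4-cycle. It assumes that $\mathcal{L}$ is the normalized Laplacian $I - D^{-1/2} A D^{-1/2}$ of an unweighted graph (so $\|\mathcal{L}\| \leq 2$); the helper names are illustrative only.

```python
import math

def laplacian_quadratic(adj, x):
    """Return x^T L x for L = I - D^{-1/2} A D^{-1/2} (assumed form of the Laplacian)."""
    deg = [len(nbrs) for nbrs in adj]
    total = sum(xi * xi for xi in x)
    for u, nbrs in enumerate(adj):
        for v in nbrs:
            total -= x[u] * x[v] / math.sqrt(deg[u] * deg[v])
    return total

def norm_sq(x):
    """Squared Euclidean norm."""
    return sum(xi * xi for xi in x)

# 4-cycle: bipartite, so the alternating vector has Rayleigh quotient exactly 2 (epsilon = 0).
cycle = [[1, 3], [0, 2], [1, 3], [2, 0]]
x = [1.0, -1.0, 1.0, -1.0]
delta = 0.05
x_prime = [1.1, -1.0, 1.0, -1.0]  # ||x - x'|| = 0.1 = delta * ||x||
```

Here `x_prime` keeps at least a $(1 - 3\delta)$ fraction of the squared norm, and its Rayleigh quotient stays above $2 - 12\delta$, as the claim predicts.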

Now we prove Lemma 3.11.
Proof: Our cutting procedure is somewhat different from the sweep cut used in [Tre09]. The
most naive cut algorithm would take $q_i$ and perform a sweep cut. Lemma 3.7 combined with Lemma 3.6
would show that we can get a good cut. Unfortunately, we are using an approximate version of $y_i$
($\tilde{y}_i$) for this purpose. Nonetheless, Claim 3.12 tells us that we can get good estimates of $y_i$, so $\tilde{y}_i$ is
close to $y_i$. Claim 3.13 tells us that this is good enough for all these arguments to go through (since
Lemma 3.6 only requires a bound on the Rayleigh quotient).



Our algorithm Find-threshold performs a geometric search for the right threshold, invoking
Threshold many times. In the $r$-th call of Threshold, let the estimate vector $\tilde{y}_i^{(r)}$ be generated.
Using these, we will construct a vector $\tilde{y}_i$. This construction is not done by the algorithm, and is
only a thought experiment to help us analyze Find-threshold.

Initially, all coordinates of $\tilde{y}_i$ are undefined, and we incrementally set values. We call
Threshold$(i, t_r)$ in order, just as Find-threshold does. In the call to Threshold$(i, t_r)$, we observe
the vertices that are classified. These are the vertices $j$ for which $\tilde{y}_i^{(r)}(j) \geq t_r$ and which have
not been classified before. For all such $j$, we set $\tilde{y}_i(j) := t_r$. We then proceed to the next call of
Threshold and continue until the last call. After the last invocation of Threshold, we
simply set any unset $\tilde{y}_i(j)$ to 0.

Claim 3.14 $\|D^{1/2} \tilde{y}_i - q_i\| \leq 7\gamma \|q_i\|$

Proof: Suppose yi(j) > t r (l + 47). Note that ||y- r ~^(j) - qi(j)\\ < T^r-i/Vj- Therefore, 

Vf^ij) > *r(l + 47) - 7*r-l > <r(l + 4 7) " 7(1 + ^l)U > *r(l + ^l) > *r-l 

So yi(j) must be set in round r — 1, if not before. If yfi{j) remains unset to the end (and is hence 0), 
then yi(j) < t r (l + 47). This implies that qi(j) < 2 7 /m 1+ ^/ 2 . The total contribution of all these 
coordinates to the difference ||-D 1//2 ?/j — qi\\ 2 is at most 4j 2 /m 1+ ^ < 47 2 ||gj|| 2 . 

Su ppose y~i(j) is set in round r to t r . This means that y~l r \j) > t r . By the choice of w(t r ) and 
Claim 



3.12 



fdjWi U) ~ Vii.j)\ < 7*r- Therefore, 



\VdjVi r) U) ~ ffi0')l < 7*r < 2 7qi (j) 
=> yd j yf\j)<{l + 2 1 )q l {j) 
=^ s/djViU) = \fdjU < (1 + 2-y)qi(j) 
Combining with the first part, we get \\/djyi{j) — Qi{j)\ — *WQi(J)' a 

We now observe that sweep cuts in $\tilde{y}_i$ generate exactly the same classifications that
Threshold$(i, t_r)$ outputs. Therefore, it suffices to analyze sweep cuts of $\tilde{y}_i$. We need to understand why
there are thresholds that cut away many vertices. Observe that the coordinates of $\tilde{y}_i$ are of the form
$(1 - \gamma)^r$. This vector partitions all vertices in a natural way. For each $r$, define $R_r := \{j \mid \tilde{y}_i(j) = t_r\}$.
Call $r$ sparse, if

$$\Big(\sum_{j \in R_r} d_j\Big) t_r^2 < \frac{\gamma^3}{m^{1+\mu} \log n}.$$



Otherwise, it is dense. Note that a dense threshold exactly satisfies the condition in Lemma 3.11.
Abusing notation, we call a vertex $j$ sparse if $j \in R_r$, such that $r$ is sparse. Similarly, a threshold $t_r$
is sparse if $r$ is sparse. We construct a vector $\hat{y}_i$. If $j \in R_r$, for $r$ sparse, then $\hat{y}_i(j) := 0$. Otherwise
$\hat{y}_i(j) := \tilde{y}_i(j)$.
Claim 3.15 $\|D^{1/2}(\tilde{y}_i - \hat{y}_i)\| \leq 2\gamma \|q_i\|$
Proof:

$$\|D^{1/2}(\tilde{y}_i - \hat{y}_i)\|^2 = \sum_{j : \hat{y}_i(j) = 0} d_j \tilde{y}_i(j)^2 = \sum_{r : r \text{ sparse}} \sum_{j \in R_r} d_j t_r^2 \leq \frac{4 \log n}{\gamma} \cdot \frac{\gamma^3}{m^{1+\mu} \log n} = \frac{4\gamma^2}{m^{1+\mu}} \leq 4\gamma^2 \|q_i\|^2$$

□

Let us now deal with the vector $\hat{y}_i$ and perform the sweep cut of [Tre09]. All coordinates of $\hat{y}_i$
are at most 1. We choose a threshold $t$ at random: we select $t^2$ uniformly at random$^6$ from $[0, 1]$.
We do a rounding to get the vector $z_t \in \{-1, 0, 1\}^n$:

$$z_t(j) = \begin{cases} 1 & \text{if } \hat{y}_i(j) \geq t \\ -1 & \text{if } \hat{y}_i(j) \leq -t \\ 0 & \text{if } |\hat{y}_i(j)| < t \end{cases}$$
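
The rounding step can be sketched in a few lines. The function names are illustrative; the only assumptions are that the vector is given as a list of floats with entries of absolute value at most 1 and that $t^2$ (not $t$) is drawn uniformly from $[0, 1]$.

```python
import math
import random

def random_threshold(rng):
    """Draw t so that t^2 is uniform on [0, 1]."""
    return math.sqrt(rng.random())

def round_at_threshold(y, t):
    """Map each coordinate of y to +1, -1 or 0 by comparing it with the threshold t."""
    z = []
    for value in y:
        if value >= t:
            z.append(1)
        elif value <= -t:
            z.append(-1)
        else:
            z.append(0)
    return z
```

For instance, with threshold 0.4 the vector `[0.9, -0.5, 0.2, -0.1]` rounds to `[1, -1, 0, 0]`: the first two vertices are classified on opposite sides and the last two remain unclassified.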



The non-zero vertices in $z_t$ are classified accordingly. A cut edge is one both of whose endpoints are
non-zero and of opposite sign. A cross edge is one where only one endpoint is zero. This classifying
procedure is shown to cut a large fraction of edges. By Lemma 3.7 we have $q_i^T \mathcal{L} q_i \geq 2(1 - \bar{\epsilon})\|q_i\|^2$
(where $\bar{\epsilon}$ is some function of $\epsilon$ and $\mu$). By Claims 3.14, 3.15 and Claim 3.13, $(D^{1/2} \hat{y}_i)^T \mathcal{L} (D^{1/2} \hat{y}_i) \geq
2(1 - \bar{\epsilon} - c\gamma)\|D^{1/2} \hat{y}_i\|^2$. Then, by Lemma 3.6, there are good thresholds for $\hat{y}_i$. It remains to prove
the following claim.

Claim 3.16 There are thresholds for y~i that are dense and good. 

Proof: We follow the analysis of [Sot09]. We will perform sweep cuts for both $\hat{y}_i$ and $\tilde{y}_i$ and
follow their behavior. First, let us take the sweep cut over $\hat{y}_i$. Consider the indicator random variable
$C(j,k)$ (resp. $X(j,k)$) that is 1 if edge $(j,k)$ is a cut (resp. cross) edge. It is then shown that
$E[C(j,k) + \beta X(j,k)] \geq \beta(1 - \beta)(\hat{y}_i(j) - \hat{y}_i(k))^2$, where the expectation is over the choice of the
threshold $t$. Let us define a slightly different choice of random thresholds. As before, $t^2$ is chosen
uniformly at random from $[0, 1]$. Then, we find the smallest $t_r$ such that $r$ is dense and $t_r \geq t$. We
use this $t^* := t_r$ as the threshold for the cut. Observe that this gives the same distribution over cuts
as the original and only selects dense thresholds. This is because in $\hat{y}_i$ all non-dense vertices are
set to 0. All thresholds strictly in between two consecutive dense $t_r$'s output the same classification.
The expectations of $C(j,k)$ and $X(j,k)$ are still the same.

We define analogous random variables $C'(j,k)$ and $X'(j,k)$ for $\tilde{y}_i$. We still use the distribution
over dense thresholds as described above. When both $j$ and $k$ are dense, we note that $C'(j,k) =
C(j,k)$ and $X'(j,k) = X(j,k)$. This is because if $t$ falls below, say, $\hat{y}_i(j)$ (which is equal to $\tilde{y}_i(j)$),
then $j$ will be cut. Even though $t^* \geq t$, it will not cross $\tilde{y}_i(j)$, since $j$ is dense. So, we have
$E[C'(j,k) + \beta X'(j,k)] = E[C(j,k) + \beta X(j,k)]$.

If both j and k are not dense, then C'(j, k) = X'(j, k) = 0. Therefore, E[C(j, k) + (3X(j, k)] > 
E[C'(j,k) + (3X'(j,k)]. That leaves the main case, where k is dense but j is not. Note that 
E[C(j,k)} = 0, since m{j) = 0. We have E[X(j,k)} = yi{k) 2 = m{k) 2 . If \yfi(j)\ < \m{k)\, then 
E[X'(j,k)] = m{k) 2 - y t (j) 2 . If < \m(k)l then E[X'(j,k)} > > yi{k) 2 - y^j) 2 . So, we can 

bound E[A'(j,£;)] > E[X(j, fc)]-y 4 (j) 2 and E[C'(j, A:)+/3A'(j, A:)] > P(l-P)(yi(j)-yi(k)) 2 -Pyi(j) 2 - 



6 Both [Tre09] and [Sot09] actually select $t$ uniformly at random, and use $\sqrt{t}$ as a threshold. We do this modified
version because it is more natural, for our algorithm, to think of the threshold as a lower bound on the probabilities 
we can detect. 






Summing over all edges, and applying the bound in Lemma 4.2 of 3.6 for the non-prime random 
variables (dealing with we get 

E[J2C(j,k) + j3X%k)] > E[J2c(j,k)+j3X(j,k)}-/3 £ d^j) 2 

(j,k) (j,k) j sparse 

> (3) (Uj)-m)) 2 -Pi 2 \\D l/2 m\\ 2 

(j,k) edge 

> 2(l-a)P(l-P)\\D 1 / 2 

> 2(1 - a )f3(l-(3)\\D ll2 y 



yi \\-- 4/3(1 - ptfWD 1 / 2 ^ 



The second last step comes from the bound on (D 1 / 2 yi) T C(D 1 ^ 2 yi) we have found, and the obser- 
vation that (3 will always be set to less than 1/2. We have 1 — a = e~^ 2S+fl \l — e) — 0(7) (based 
on Lemma 3.7 Since \a — a\ = 0(7), we get a as given in Lemma 3.5 Because of the equations 
above, the analysis of [Sot09] shows that the randomly chosen threshold t* has the property that 

Cut(P(yi,f),N(y u f)) > f(v)1nc{P(yi,t*),Nfa,t*)) 

Therefore, some threshold satisfies the condition 2(b) of Find-threshold. Note that the thresh- 
olds are chosen over a distribution of dense thresholds. Hence, there is a good and dense threshold. 
□ 

□ 



4 CutOrBound and local partitioning 

We describe our local partitioning procedure CutOrBound which is used to get the improved
running time. We first set some notation. For a subset of vertices $S \subseteq V$, define $\bar{S} = V \setminus S$, and let
$E(S, \bar{S})$ be the set of edges crossing the cut $(S, \bar{S})$. Define the weight of $S$ to be $\omega(S) = 2\mathrm{Vol}(S)$,
to account for the self-loops of weight 1/2: we assume that each vertex has a self-loop of weight
$d_i$, and the random walk simply chooses one edge with probability proportional to its weight.
For convenience, given a vertex $j$, $\omega(j) = \omega(\{j\}) = 2d_j$. For a subset of edges $F \subseteq E$, let
$\omega(F) = \sum_{e \in F} w_e$. The conductance of the set $S$, $\phi_S$, is defined to be $\phi_S = \frac{\omega(E(S, \bar{S}))}{\min\{\omega(S), \omega(\bar{S})\}}$.
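
A minimal sketch of these definitions for an unweighted graph follows. It assumes the conductance is the weight of $E(S, \bar{S})$ divided by $\min\{\omega(S), \omega(\bar{S})\}$ with $\omega(S) = 2\mathrm{Vol}(S)$; the adjacency-dictionary representation and function names are illustrative.

```python
def omega(adj, subset):
    """Weight of a vertex set: twice its volume (sum of degrees)."""
    return 2 * sum(len(adj[v]) for v in subset)

def conductance(adj, subset):
    """Conductance of `subset`: crossing edges over the smaller of the two side weights."""
    subset = set(subset)
    complement = set(adj) - subset
    crossing = sum(1 for u in subset for v in adj[u] if v not in subset)
    denom = min(omega(adj, subset), omega(adj, complement))
    return crossing / denom if denom > 0 else float("inf")
```

On a 4-cycle, the set of two adjacent vertices has 2 crossing edges and weight $2 \cdot 4 = 8$ on each side, giving conductance $1/4$.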



CutOrBound Input: Graph $G$. Parameters: Starting vertex $i$, $\alpha = m^{-\tau}$, $\ell = \ln(m)/\zeta$.

1. Define $\phi$ to satisfy $-\log(\frac{1}{2}(\sqrt{1 - 2\phi} + \sqrt{1 + 2\phi})) = \zeta\tau$, $w = \lceil 30 \ell^2 \ln(n)/\alpha \rceil = O(\log^3(n)/\alpha)$,
$b = \lceil \frac{\ell}{2(1 - 2\phi)\alpha} \rceil = O(\log(n)/\alpha)$.

2. Run $w$ random walks of length $\ell$ from $i$.

3. For each length $l = 0, 1, 2, \ldots, \ell$:

(a) For any vertex $j$, let $w_j$ be the number of walks of length $l$ ending at $j$. Order the
vertices in decreasing order of the ratio $w_j/d_j$, breaking ties arbitrarily.

(b) For all $k \leq b$, compute the conductance of the set of top $k$ vertices in this order.

(c) If the conductance of any such set is less than $\phi$, stop and output the set.

4. Declare that $\max_j \frac{p^\ell_j}{2d_j} \leq 256\alpha$.
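
The steps of CutOrBound can be sketched as follows. This is an illustrative sketch, not the implementation analyzed in this section: it assumes an unweighted graph without isolated vertices, integer vertex labels (used only to break ties), lazy steps with probability 1/2, and it recomputes each conductance from scratch, so it does not achieve the stated running time. The parameters `length`, `num_walks`, `top_b` and `phi` are passed in directly.

```python
import random

def conductance(adj, subset):
    """Crossing edges divided by the smaller side weight, with weight = 2 * volume."""
    crossing = sum(1 for u in subset for v in adj[u] if v not in subset)
    inside = 2 * sum(len(adj[v]) for v in subset)
    outside = 2 * sum(len(adj[v]) for v in adj if v not in subset)
    denom = min(inside, outside)
    return crossing / denom if denom > 0 else float("inf")

def cut_or_bound(adj, start, length, num_walks, top_b, phi, rng):
    """Return a sweep set of conductance < phi, or None if no such set is found."""
    positions = [start] * num_walks
    for step in range(length + 1):
        if step > 0:
            # Advance every walk by one lazy step.
            positions = [v if rng.random() < 0.5 else rng.choice(adj[v])
                         for v in positions]
        counts = {}
        for v in positions:
            counts[v] = counts.get(v, 0) + 1
        # Order by w_j / d_j, decreasing; ties are broken by vertex label.
        order = sorted(adj, key=lambda j: (-counts.get(j, 0) / len(adj[j]), j))
        prefix = set()
        for j in order[:top_b]:
            prefix.add(j)
            if conductance(adj, prefix) < phi:
                return set(prefix)
    return None  # corresponds to declaring the bound on the maximum probability
```

On a graph made of two disjoint edges, a walk started at vertex 0 can never leave its component, and the sweep already finds the zero-conductance set `{0, 1}` at length 0.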
The main theorem of this section is: 
Theorem 4.1 Suppose a lazy random walk is run from a vertex $i$ for $\ell = \ln(m)/\zeta$ steps, for
some constant $\zeta$. Let $p^\ell$ be the probability distribution induced on the final vertex. Let $\alpha = m^{-\tau}$,
for constant $\tau < 1$, be a given parameter so that $\zeta\tau < 1/8$, and let $\phi$ be chosen to satisfy
$-\log(\frac{1}{2}(\sqrt{1 - 2\phi} + \sqrt{1 + 2\phi})) = \zeta\tau$. Then, there is an algorithm CutOrBound, that with
probability $1 - o(1)$, in $O(\log^4(n)/\alpha)$ time, finds a cut of conductance less than $\phi$, or declares correctly
that $\max_j \frac{p^\ell_j}{2d_j} \leq 256\alpha$.
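
The defining equation for $\phi$ can be solved in closed form: squaring $\sqrt{1 - 2\phi} + \sqrt{1 + 2\phi} = 2e^{-\psi}$ gives $\sqrt{1 - 4\phi^2} = 2e^{-2\psi} - 1$. The sketch below uses this to compute the parameters of CutOrBound, assuming natural logarithms throughout; the function names and the returned dictionary are illustrative, and in practice the walk length would be rounded up to an integer.

```python
import math

def phi_from_psi(psi):
    """Solve -log((sqrt(1-2*phi) + sqrt(1+2*phi)) / 2) = psi for phi in [0, 1/2]."""
    c = 2.0 * math.exp(-2.0 * psi) - 1.0  # equals sqrt(1 - 4 * phi^2)
    return 0.5 * math.sqrt(1.0 - c * c)

def psi_from_phi(phi):
    """Evaluate the left-hand side of the defining equation."""
    return -math.log(0.5 * (math.sqrt(1 - 2 * phi) + math.sqrt(1 + 2 * phi)))

def cut_or_bound_parameters(n, m, zeta, tau):
    """Compute alpha, walk length, phi, number of walks w and sweep size b."""
    alpha = m ** (-tau)
    length = math.log(m) / zeta
    phi = phi_from_psi(zeta * tau)
    num_walks = math.ceil(30 * length ** 2 * math.log(n) / alpha)
    top_b = math.ceil(length / (2 * (1 - 2 * phi) * alpha))
    return {"alpha": alpha, "length": length, "phi": phi,
            "num_walks": num_walks, "top_b": top_b}
```

For $\zeta\tau < 1/8$ the resulting $\phi$ stays strictly below $1/2$, which is what keeps $1 - 2\phi$ positive in the definition of $b$.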

We provide a sketch before giving the detailed proof. We use the Lovász-Simonovits curve
technique [LS90]. For every length $l = 0, 1, \ldots, \ell$, let $p^l$ be the probability vector induced on vertices
after running a random walk of length $l$. The Lovász-Simonovits curve $I^l : [0, 2m] \to [0, 1]$ is
constructed as follows. Let $j_1, j_2, \ldots, j_n$ be an ordering of the vertices such that
$\frac{p^l_{j_1}}{\omega(j_1)} \geq \frac{p^l_{j_2}}{\omega(j_2)} \geq \cdots \geq \frac{p^l_{j_n}}{\omega(j_n)}$.
For $k \in \{1, \ldots, n\}$, define the set $S^l_k = \{j_1, j_2, \ldots, j_k\}$. For convenience, we define $S^l_0 = \emptyset$, the
empty set. For a subset of vertices $S$, and a probability vector $p$, define $p(S) = \sum_{i \in S} p_i$. Then,
we define the curve $I^l$ at the following points: $I^l(\omega(S^l_k)) := p^l(S^l_k)$, for $k = 0, 1, 2, \ldots, n$. Now we
complete the curve $I^l$ by interpolating between these points using line segments. Note that this
curve is concave because the slopes of the line segments are decreasing. Also, it is an increasing
function. Lovász and Simonovits prove that as $l$ increases, $I^l$ "flattens" out, at a rate governed by
the conductance. A flatter $I^l$ means that the probabilities at vertices are more equal (slopes are
not very different), and hence the walk is mixing.

Roughly speaking, the procedure CutOrBound only looks at the portion of $I^l$ up to $S^l_b$, since it
only tries to find sweep cuts among the top $b$ vertices. We would like to argue that if CutOrBound
is unsuccessful in finding a low conductance cut there, the maximum probability should be small.
In terms of the $I^l$'s, this means that the portion up to $S^l_b$ flattens out rapidly. In some sense, we
want to prove versions of theorems in [LS90] that only talk about a prefix of the $I^l$ curves.

The issue now is that it is not possible to compute the $p^l$'s (and $I^l$) exactly since we only use
random walks. We run walks of length $l$ and get an empirical distribution $\tilde{p}^l$. We define $\tilde{I}^l$ to be the
Lovász-Simonovits curve corresponding to $\tilde{p}^l$. If we run sufficiently many random
walks and aggregate them to compute $\tilde{p}^l$, then concentration bounds imply that $\tilde{p}^l_j$ is close to $p^l_j$
(when $p^l_j$ is large enough). Ideally, this should imply that the behavior of $\tilde{I}^l$ is similar to $I^l$. There
is a subtle difficulty here. The order of vertices with respect to $p^l$ and $\tilde{p}^l$ could be very different,
and hence prefixes in $I^l$ and $\tilde{I}^l$ could be dealing with different subsets of vertices. Just because
$I^l$ is flattening, it is not obvious that $\tilde{I}^l$ is doing the same.

Nonetheless, because for large $p^l_j$'s, $\tilde{p}^l_j$ is a good approximation, some sort of flattening happens
for $\tilde{I}^l$. We give some precise expressions to quantify this statement. Suppose CutOrBound is
unable to find a cut of conductance $\phi$. Then we show that for any $x \in [0, 2m]$, if $\hat{x} = \min\{x, 2m - x\}$,

$$\tilde{I}^l(x) \leq \frac{e^{3\delta}}{2}\big(\tilde{I}^{l-1}(x - 2\phi\hat{x}) + \tilde{I}^{l-1}(x + 2\phi\hat{x})\big) + 4\delta\alpha x.$$

This is the flattening from $l - 1$ to $l$. Since $\tilde{I}^{l-1}$ is concave, the averaging in the first part shows
that $\tilde{I}^l(x)$ is much smaller than $\tilde{I}^{l-1}(x)$. Note the additive error term, which does not occur
in [LS90]. This shows that when $x$ is large, this bound is not interesting. That is no surprise,
because we can only sample some prefix of $I^l$. Then, we prove by induction on $l$ that, if we define
$\psi = -\log(\frac{1}{2}(\sqrt{1 - 2\phi} + \sqrt{1 + 2\phi})) = \zeta\tau$, then $\tilde{I}^l(x) \leq e^{3\delta l}\big[\sqrt{\hat{x}}\, e^{-\psi l} + \frac{x}{2m}\big] + 4e^{4\delta l}\alpha x$. Assuming
that $\delta \approx 1/\ell$, the $e^{-\psi l}$ term decays very rapidly. For the final $\ell = \Omega(\log(n)/\psi)$, we are only left with
the error term, which will be $O(\alpha)$. We then get $\max_j \frac{\tilde{p}^\ell_j}{2d_j} = \tilde{I}^\ell(1) \leq O(e^{-\psi\ell} + \frac{1}{m} + \alpha) \leq O(\alpha)$.
4.1 Proof of Theorem 4.1

First, we note that $\phi \leq \sqrt{2\zeta\tau}$, so $1 - 2\phi > 0$. Consider the algorithm CutOrBound described above.

It is easy to see that this algorithm can be implemented to run in time $O(\log^4(n)/\alpha)$. We now
prove that this algorithm has the claimed behavior. We make use of the Lovász-Simonovits curve
technique. For every length $l = 0, 1, \ldots, \ell$, let $p^l$ be the probability vector induced on vertices after
running a random walk of length $l$.

Now, we construct the Lovász-Simonovits curve [LS90], $I^l : [0, 2m] \to [0, 1]$ as follows. Let
$j_1, j_2, \ldots, j_n$ be an ordering of the vertices as follows:

$$\frac{p^l_{j_1}}{\omega(j_1)} \geq \frac{p^l_{j_2}}{\omega(j_2)} \geq \cdots \geq \frac{p^l_{j_n}}{\omega(j_n)}.$$

For $k \in \{1, \ldots, n\}$, define the set $S^l_k = \{j_1, j_2, \ldots, j_k\}$. For convenience, we define $S^l_0 = \emptyset$, the
empty set. For a subset of vertices $S$, and a probability vector $p$, define $p(S) = \sum_{i \in S} p_i$. Then,
we define the curve $I^l$ at the following points: $I^l(\omega(S^l_k)) := p^l(S^l_k)$, for $k = 0, 1, 2, \ldots, n$. Now
we complete the curve $I^l$ by interpolating between these points using line segments. Note that
the slope of the line segment of the curve at the points $\omega(S^l_k)$, $\omega(S^l_{k+1})$ is exactly $\frac{p^l_{j_{k+1}}}{\omega(j_{k+1})}$. A direct
definition of the curve is the following: for any point $x \in [0, 2m]$, if $k$ is the unique index where
$x \in [\omega(S^l_k), \omega(S^l_{k+1}))$, then $I^l(x) = p^l(S^l_k) + (x - \omega(S^l_k)) \cdot \frac{p^l_{j_{k+1}}}{\omega(j_{k+1})}$.
A useful alternative definition for $I^l(x)$ is the following:

$$I^l(x) = \max \sum_i p^l_i w_i \quad \text{subject to } w_1, w_2, \ldots, w_n \in [0, 1];\ \sum_i \omega(i) w_i \leq x. \qquad (2)$$
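
The alternative definition (2) is a fractional knapsack problem, which is solved greedily by taking vertices in decreasing order of $p_j/\omega(j)$; this coincides with the piecewise-linear definition. A small sketch (the function name and list-based inputs are illustrative, and all weights are assumed positive):

```python
def ls_curve(p, omega, x):
    """Evaluate the Lovasz-Simonovits curve at x for probabilities p and weights omega."""
    order = sorted(range(len(p)), key=lambda j: -p[j] / omega[j])
    value, budget = 0.0, float(x)
    for j in order:
        if budget <= 0:
            break
        take = min(1.0, budget / omega[j])  # fractional weight w_j in [0, 1]
        value += take * p[j]
        budget -= take * omega[j]
    return value
```

For example, with `p = [0.5, 0.3, 0.2]` and `omega = [2, 2, 4]`, the breakpoints are at $x = 2, 4, 8$ with values $0.5, 0.8, 1$, and the curve at $x = 3$ interpolates to $0.65$.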

Note that this curve is concave because the slopes of the line segments are decreasing. Also, it
is an increasing function. Now, Lovász and Simonovits prove the following facts about the curve:
let $S \subseteq V$ be any set of vertices, and let $x_S = \omega(S)$ and $\phi_S$ be its conductance. For $x \in [0, 2m]$,
define $\hat{x} = \min\{x, 2m - x\}$. Then, we have the following:

$$p^l(S) \leq \frac{1}{2}\big(I^{l-1}(x_S - 2\phi_S \hat{x}_S) + I^{l-1}(x_S + 2\phi_S \hat{x}_S)\big). \qquad (3)$$

Furthermore, for any $x \in [0, 2m]$, we have $I^l(x) \leq I^{l-1}(x)$.

The issue now is that it is not possible to compute the $p^l$'s exactly since we only use random
walks. Fix an error parameter $\delta = 1/\ell$. In the algorithm CutOrBound, we run $w = c \cdot \frac{1}{\alpha} \cdot \ln(n)$
walks of length $\ell$, where $c = 30/\delta^2$. For each length $l$, $0 \leq l \leq \ell$, consider the empirical distribution
$\tilde{p}^l$ induced by the walks on the vertices of the graph, i.e. $\tilde{p}^l_j = w_j/w$, where $w_j$ is the number of walks
of length $l$ ending at $j$. We search for low conductance cuts by ordering the vertices in decreasing
order of $\tilde{p}^l$ and checking the sets of top $k$ vertices in this order, for all $k = 1, 2, \ldots, O(1/\delta\alpha)$.
This takes time $O(w\ell)$. To show that this works, first, define $\tilde{I}^l$ to be the Lovász-Simonovits curve
corresponding to $\tilde{p}^l$. Then, we have the following:

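Lemma 4.2 With probability $1 - o(1)$, the following holds. For every subset of vertices
$S \subseteq V$, we have

$$(1 - \delta) p^l(S) - \delta\alpha\omega(S) \leq \tilde{p}^l(S) \leq (1 + \delta) p^l(S) + \delta\alpha\omega(S).$$

For every length $l$, and every $x \in [0, 2m]$,

$$(1 - \delta) I^l(x) - \delta\alpha x \leq \tilde{I}^l(x) \leq (1 + \delta) I^l(x) + \delta\alpha x.$$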
Proof: For any vertex $j$, define $\delta_j = \delta(p^l_j + \alpha)$. By Bernstein's inequality, we have

$$\Pr[|\tilde{p}^l_j - p^l_j| > \delta_j] \leq 2\exp\left(-\frac{\delta_j^2 w}{2p^l_j + 2\delta_j/3}\right) \leq 2\exp(-\delta^2 c \ln(n)/3) \leq 1/n^{10}$$

since $c = 30/\delta^2$. So with probability at least $1 - o(1)$, for all lengths $l$, and for all vertices $j$, we
have

$$(1 - \delta) p^l_j - \delta\alpha \leq \tilde{p}^l_j \leq (1 + \delta) p^l_j + \delta\alpha.$$

Assume this is the case. This immediately implies that for any set $S$, we have

$$(1 - \delta) p^l(S) - \delta\alpha|S| \leq \tilde{p}^l(S) \leq (1 + \delta) p^l(S) + \delta\alpha|S|.$$

Now, because both curves $I^l$ and $\tilde{I}^l$ are piecewise linear, concave and increasing, to prove
the lower bound in the claimed inequality, it suffices to prove it for only $x = x_k = \omega(S^l_k)$, for
$k = 0, 1, \ldots, n$. So fix such an index $k$.

Now, $I^l(x_k) = p^l(S^l_k)$. Consider $\tilde{p}^l(S^l_k)$. We have

$$\tilde{p}^l(S^l_k) \geq (1 - \delta) p^l(S^l_k) - \delta\alpha|S^l_k| \geq (1 - \delta) p^l(S^l_k) - \delta\alpha\omega(S^l_k).$$

Now, the alternative definition (2) of the Lovász-Simonovits curve implies that $\tilde{I}^l(\omega(S^l_k)) \geq \tilde{p}^l(S^l_k)$,
so we get

$$\tilde{I}^l(x_k) \geq (1 - \delta) p^l(S^l_k) - \delta\alpha x_k = (1 - \delta) I^l(x_k) - \delta\alpha x_k,$$

as required. The upper bound is proved similarly, considering instead the corresponding sets $\tilde{S}^l_k$
for $\tilde{I}^l$ consisting of the top $k$ vertices in $\tilde{p}^l$ probability. □

The algorithm CutOrBound can be seen to be searching for low conductance cuts in the top
$b$ vertices in the order given by $\tilde{p}^l_j/\omega(j)$. Now, we prove that if we only find large conductance cuts,
then the curve $\tilde{I}^l$ "flattens" out rapidly. Let $j'_1, j'_2, \ldots, j'_n$ be this order. Let $\tilde{S}^l_k = \{j'_1, j'_2, \ldots, j'_k\}$
be the set of top $k$ vertices in the order, $x_k = \omega(\tilde{S}^l_k)$, and $\phi_k$ be the conductance of $\tilde{S}^l_k$. Now we
are ready to show our flattening lemma:

Lemma 4.3 With probability $1 - o(1)$, the following holds. Suppose the algorithm CutOrBound
finds only cuts of conductance at least $\phi$ when sweeping over the top $b$ vertices in $\tilde{p}^l$ probability. Then, for
any index $k = 0, 1, \ldots, n$, we have

$$p^l(\tilde{S}^l_k) \leq \frac{1}{2}\big(I^{l-1}(x_k - 2\phi\hat{x}_k) + I^{l-1}(x_k + 2\phi\hat{x}_k)\big) + \delta\alpha\phi\hat{x}_k.$$

Proof: Let $G = \{j : \frac{p^{l-1}_j}{\omega(j)} > \delta\alpha\}$. We have $1 \geq p^{l-1}(G) \geq \delta\alpha\omega(G)$, so $\omega(G) \leq 1/\delta\alpha$.

As defined in the algorithm CutOrBound, let $b = \lceil \frac{1}{2(1 - 2\phi)\delta\alpha} \rceil$. Let $a$ be the largest index so
that $\tilde{p}^l_{j'_a} > 0$. If $a < b$, then let $Z$ be the set of $b - a$ vertices of zero $\tilde{p}^l$ probability considered by
algorithm CutOrBound for searching for low conductance cuts. We assume that in choosing the
ordering of vertices to construct $\tilde{I}^l$, the vertices in $Z$ appear right after the vertex $j'_a$. This doesn't
change the curve $\tilde{I}^l$ since the zero $\tilde{p}^l$ probability vertices may be arbitrarily ordered.

Suppose that the algorithm CutOrBound finds only cuts of conductance at least $\phi$ when
running over the top $b$ vertices. Then, let $k$ be some index in $0, 1, \ldots, n$. We consider two cases for
the index $k$:

Case 1: $k \leq b$:
In this case, since the sweep only yielded cuts of conductance at least $\phi$, we have $\phi_k \geq \phi$. Then (3)
implies, using the concavity of $I^{l-1}$, that

$$p^l(\tilde{S}^l_k) \leq \frac{1}{2}\big(I^{l-1}(x_k - 2\phi_k\hat{x}_k) + I^{l-1}(x_k + 2\phi_k\hat{x}_k)\big) \leq \frac{1}{2}\big(I^{l-1}(x_k - 2\phi\hat{x}_k) + I^{l-1}(x_k + 2\phi\hat{x}_k)\big).$$

Case 2: $k > b$:
We have

$$x_k \geq x_b = \omega(\tilde{S}^l_b) \geq 2b \geq \frac{1}{(1 - 2\phi)\delta\alpha} \geq \frac{\omega(G)}{1 - 2\phi}.$$

Thus, $\omega(G) \leq (1 - 2\phi)x_k \leq x_k - 2\phi\hat{x}_k$. Hence, the slope of the curve $I^{l-1}$ at the point $x_k - 2\phi\hat{x}_k$
is at most $\delta\alpha$. Since the curve $I^{l-1}$ is concave and increasing, we conclude that

$$I^{l-1}(x_k - 2\phi\hat{x}_k) \geq I^{l-1}(x_k) - 2\delta\alpha\phi\hat{x}_k,$$

and

$$I^{l-1}(x_k + 2\phi\hat{x}_k) \geq I^{l-1}(x_k).$$

Since $p^l(\tilde{S}^l_k) \leq I^l(x_k) \leq I^{l-1}(x_k)$,

$$p^l(\tilde{S}^l_k) \leq \frac{1}{2}\big(I^{l-1}(x_k - 2\phi\hat{x}_k) + I^{l-1}(x_k + 2\phi\hat{x}_k)\big) + \delta\alpha\phi\hat{x}_k.$$

This completes the proof of the lemma. □

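Since the bounds of Lemma 4.2 hold with probability $1 - o(1)$, we assume from now on that this is
indeed the case for all lengths $l$. Thus, we conclude that if we never find a cut of conductance at
most $\phi$, then for any index $k = 0, 1, \ldots, n$, we have

$$\tilde{I}^l(x_k) = \tilde{p}^l(\tilde{S}^l_k)$$
$$\leq (1 + \delta) p^l(\tilde{S}^l_k) + \delta\alpha x_k \qquad \text{(by Lemma 4.2)}$$
$$\leq \frac{1 + \delta}{2}\big(I^{l-1}(x_k - 2\phi\hat{x}_k) + I^{l-1}(x_k + 2\phi\hat{x}_k)\big) + 2\delta\alpha x_k \qquad \text{(by Lemma 4.3)}$$
$$\leq \frac{1 + \delta}{2(1 - \delta)}\big(\tilde{I}^{l-1}(x_k - 2\phi\hat{x}_k) + \tilde{I}^{l-1}(x_k + 2\phi\hat{x}_k)\big) + 4\delta\alpha x_k \qquad \text{(by Lemma 4.2)}$$

Here, we use the facts that $(1 + \delta)\phi \leq 1$, and $\frac{1+\delta}{1-\delta} \leq 2$. Now, because $\tilde{I}^l$ is a piecewise linear and
concave function, where the slope only changes at the $x_k$ points, the above inequality implies that
for all $x \in [0, 2m]$, we have

$$\tilde{I}^l(x) \leq \frac{e^{3\delta}}{2}\big(\tilde{I}^{l-1}(x - 2\phi\hat{x}) + \tilde{I}^{l-1}(x + 2\phi\hat{x})\big) + 4\delta\alpha x.$$

Here, we used the bound $\frac{1+\delta}{1-\delta} \leq e^{3\delta}$.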

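Now, assume that we never find a cut of conductance at most $\phi$ over all lengths $l$. Define
$\psi = -\log(\frac{1}{2}(\sqrt{1 - 2\phi} + \sqrt{1 + 2\phi})) = \zeta\tau$. Note that $\psi \geq \phi^2/2$. Then, we prove by induction on $l$
that

$$\tilde{I}^l(x) \leq e^{3\delta l}\Big[\sqrt{\hat{x}}\, e^{-\psi l} + \frac{x}{2m}\Big] + 4e^{4\delta l}\alpha x.$$

The statement for $l = 0$ is easy to see, since the curve $\tilde{I}^0(x) = \min\{x/2d_i, 1\}$ (recall that we start
the walk at vertex $i$). Assuming the truth of this bound for $l - 1$, we now show it for $l$. We have

$$\tilde{I}^l(x) \leq \frac{e^{3\delta}}{2}\big(\tilde{I}^{l-1}(x - 2\phi\hat{x}) + \tilde{I}^{l-1}(x + 2\phi\hat{x})\big) + 4\delta\alpha x$$
$$\leq \frac{e^{3\delta}}{2}\, e^{3\delta(l-1)}\Big[\Big(\sqrt{\widehat{(x - 2\phi\hat{x})}} + \sqrt{\widehat{(x + 2\phi\hat{x})}}\Big) e^{-\psi(l-1)} + \frac{2x}{2m}\Big] + \frac{e^{3\delta}}{2} \cdot 8e^{4\delta(l-1)}\alpha x + 4\delta\alpha x$$
$$\leq e^{3\delta l}\Big[\sqrt{\hat{x}}\, e^{-\psi l} + \frac{x}{2m}\Big] + 4e^{4\delta l}\alpha x,$$

which completes the induction. In the last step, we used the following bounds: if $x \leq m$, then

$$\sqrt{\widehat{(x - 2\phi\hat{x})}} + \sqrt{\widehat{(x + 2\phi\hat{x})}} \leq \sqrt{x - 2\phi x} + \sqrt{x + 2\phi x} = 2\sqrt{\hat{x}}\, e^{-\psi},$$

and if $x > m$, then

$$\sqrt{\widehat{(x - 2\phi\hat{x})}} + \sqrt{\widehat{(x + 2\phi\hat{x})}} \leq \sqrt{2m - (x - 2\phi(2m - x))} + \sqrt{2m - (x + 2\phi(2m - x))} = 2\sqrt{\hat{x}}\, e^{-\psi}.$$

Since $\delta = 1/\ell$, we get

$$\max_j \frac{\tilde{p}^\ell_j}{2d_j} \leq \tilde{I}^\ell(1) \leq e^{-\psi\ell + 3} + \frac{e^3}{2m} + 4e^4\alpha \leq 250\alpha,$$

assuming $\alpha = m^{-\tau}$, $\ell = \frac{\ln m}{\zeta}$, and $\psi = \zeta\tau$. Finally, again invoking Lemma 4.2, we get that
$\max_j p^\ell_j/2d_j \leq 256\alpha$, since $\delta = 1/\ell$.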
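
The last numerical step can be checked directly. The sketch below evaluates the inductive bound at $x = 1$ and $l = \ell$ with $\delta = 1/\ell$, $\ell = \ln(m)/\zeta$ and $\psi = \zeta\tau$, and compares it with $250\alpha$; the function name is illustrative and the parameter values in the usage are arbitrary.

```python
import math

def final_curve_bound(m, tau, zeta):
    """Return (bound on the empirical curve at x = 1, alpha) for the given parameters."""
    alpha = m ** (-tau)
    length = math.log(m) / zeta
    delta = 1.0 / length
    psi = zeta * tau
    x = 1.0
    bound = (math.exp(3 * delta * length)
             * (math.sqrt(x) * math.exp(-psi * length) + x / (2 * m))
             + 4 * math.exp(4 * delta * length) * alpha * x)
    return bound, alpha
```

Since $e^{-\psi\ell} = m^{-\tau} = \alpha$ and $1/(2m) \leq \alpha/2$ for $\tau \leq 1$, the bound is at most $(1.5e^3 + 4e^4)\alpha \approx 248.6\alpha$.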



5 Recursive partitioning 

Given the procedure Find-threshold, one can construct a recursive partitioning algorithm to
approximate the MaxCut. We classify some vertices through Find-threshold, remove them,
and recurse on the rest of the graph. We call this algorithm Simple. The algorithm Balance uses
the low conductance sets obtained from Theorem 4.1 and does a careful balancing of parameters
to get an improved running time. All proofs of this section, including theoretical guarantees on
approximation factors, are in Appendix 5.1. We state the procedure Simple first and provide the
relevant claims.



Simple Input: Graph $G$. Parameters: $\epsilon$, $\mu$, $\alpha$.

1. If $f(\sigma(\epsilon, \mu)) = 1/2$, then put each vertex in $L$ or $R$ uniformly at random (and return).

2. Let $P$ be a set of $O(\log n)$ vertices chosen uniformly at random.

(a) For all $i \in P$, run procedures Find-threshold$(i, \mu)$ in parallel. Stop when any one of
these succeeds or all of them fail.

3. If all procedures failed, output FAIL.

4. Let the successful output be the sets $Even_i$ and $Odd_i$. With probability 1/2, put $Even_i$ in $L$
and $Odd_i$ in $R$. With probability 1/2, do the opposite.

5. Let $\xi = 1 - Inc(Even_i, Odd_i)/m$. Set $\epsilon' = \epsilon/\xi$ and $G'$ be the induced subgraph on unclassified
vertices. Run Simple$(G', \epsilon', \mu)$. If it succeeds, output the final cut $L$ and $R$.

6. If $G$ is the original graph, put each vertex (even those already classified) randomly in $L$ or
$R$. Irrespective of $G$, output FAIL.
The guarantees of Simple are in terms of a function $H(\epsilon, \mu)$. For a given $\epsilon$ and $\mu$, let $z^*$ be
the largest value such that $f(\sigma(\epsilon/z^*, \mu)) = 1/2$. Then $H(\epsilon, \mu) := z^*/2 + \int_{z^*}^{1} f(\sigma(\epsilon/z, \mu))\,dz$. For
constant $\epsilon < 0.5$, $H(\epsilon, \mu)$ is a constant $> 0.5$.

Lemma 5.1 Let MaxCut$(G) = 1 - \epsilon$. There is an algorithm Simple'$(G, \mu)$ that, with high
probability, outputs a cut of value $H(\epsilon, \mu) - o(1)$, and thus the worst-case approximation ratio is
$\min_\epsilon \frac{H(\epsilon, \mu)}{1 - \epsilon} - o(1)$. The running time is $\tilde{O}(\Delta m^{2+\mu})$.

The algorithm Simple' is a version of Simple that only takes $\mu$ as a parameter and searches
for the appropriate value of $\epsilon$. Suppose MaxCut$(G) = 1 - \epsilon$. The procedure Simple' runs
Simple$(G, \epsilon_r, \mu, 1)$ (i.e. $\alpha = 1$), for all $\epsilon_r$ such that $1 - \epsilon_r = (1 - \gamma)^r$ and $1/2 \leq 1 - \epsilon_r \leq 1$. By
choosing $\gamma$ small enough and Claim 5.2 below, we can ensure that we cut at least $H(\epsilon, \mu) - o(1)$
fraction of edges. It therefore suffices to prove:

Claim 5.2 If Simple$(G, \epsilon, \mu)$ succeeds, it outputs a cut of (fractional) value at least $H(\epsilon, \mu)$. If it
fails, it outputs a cut of value 1/2. If MaxCut$(G) \geq 1 - \epsilon$, then Simple$(G, \epsilon, \mu)$ succeeds with
high probability. The running time is always bounded by $\tilde{O}(\Delta m^{2+\mu})$.

We now describe Balance and state the main lemma associated with it. We observe that
Balance uses CutOrBound to either decompose the graph into pieces, or ensure that we classify
many vertices. We use Theorem 4.1 to bound the running time.



Balance Input: Graph $G$. Parameters: $\epsilon_1$, $\mu_1$, $\epsilon_2$, $\mu_2$, $\alpha = m^{-\tau}$.

1. Let $P$ be a random subset of $O(\log n)$ vertices.

2. For each vertex $i \in P$, run CutOrBound$(i, \ell(\epsilon_1, \mu_1), \alpha)$.

3. If a low conductance set $S$ was found by any of the above calls:

(a) Let $G_S$ be the induced graph on $S$, and $G'$ be the induced graph on $V \setminus S$. Run
Simple$(G_S, \epsilon_2, \mu_2)$ and Balance$(G')$ (with same parameters) to get the final partition.

4. Run Simple$(G, \epsilon_1, \mu_1, \alpha)$ up to Step 4, using random vertex set $P$. Then run Balance$(G')$
(with same parameters), where $G'$ is the induced graph on the unclassified vertices.

5. Output the better of this cut and the trivial cut.



Lemma 5.3 For any constant $b > 1.5$, there is a choice of $\epsilon_1, \mu_1, \epsilon_2, \mu_2$ and $\tau$ so that Balance runs in
$\tilde{O}(\Delta m^b)$ time and provides an approximation factor that is a constant greater than 0.5.

Let us give a simple explanation for the 1.5-factor. Neglecting the $\mu$'s and polylogarithmic
factors, we perform $\tilde{O}(1/\alpha)$ walks in CutOrBound. In the worst case, we could get a low
conductance set of constant size, in which case the work per output is $\tilde{O}(1/\alpha)$. When we have the
$\alpha$ bound on probabilities, the work per output is $\tilde{O}(\alpha m)$. So it appears that $\alpha = 1/\sqrt{m}$ is the
balancing point, which yields an $\tilde{O}(m^{1.5})$ time algorithm.
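
The balancing argument can be made concrete with a tiny calculation. Ignoring the $\mu$'s, the degree and polylogarithmic factors (as in the explanation above), the worst-case work per output is modeled here as $\max(1/\alpha, \alpha m)$; this simplified cost model and the function names are assumptions made only for illustration.

```python
def work_per_output(alpha, m):
    """Worst case of the two regimes: small low-conductance set vs. bounded probabilities."""
    return max(1.0 / alpha, alpha * m)

def balancing_alpha(m):
    """The value of alpha at which the two regimes cost the same."""
    return m ** -0.5
```

For $m = 10^4$ the balancing point is $\alpha = 0.01$ with work per output $\sqrt{m} = 100$, and moving $\alpha$ in either direction increases the cost.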

In the next subsection, we define many parameters which will be central to our analysis. We
then provide detailed proofs for Claim 5.2 and Lemma 5.3. Finally, we give a graph detailing how
the approximation factor increases with running time (for both Simple and Balance).
5.1 Preliminaries 

For convenience, we list the various free parameters and dependent variables. 
• $\epsilon$ is the maxcut parameter, as described above. Eventually, this will be set to some constant
(this is explained in more detail later).

• $\mu$ is a running time parameter. This is used to control the norm of the $\tilde{y}_i$ vector, and through
that, the running time. This affects the approximation factor obtained, through Lemma 3.7.

• $\alpha (= m^{-\tau})$ is the maximum probability parameter. This directly affects the running time
through Lemma 3.5. For Simple, this is just set to 1, so it only plays a role in Balance.

• $\ell(\epsilon, \mu) := \mu \ln(4m/\delta^2)/[2(\delta + \epsilon)]$. This is the length of the random walk.

• $\sigma(\epsilon, \mu)$ is the parameter that is in Lemma 3.5. Setting $\epsilon' = -\ln(1 - \epsilon)/\mu$, we get $1 - \sigma =
e^{-\epsilon'}(1 - \epsilon)(1 - \delta)(1 - \gamma)$.

• $\chi(\epsilon, \mu, \alpha)$ is the cut parameter that comes from Theorem 4.1. When we get a set $S$ of
low conductance, the number of edges in the cut is at most $\chi(\epsilon, \mu)|Internal(S)|$. Here,
$Internal(S)$ is the set of edges internal to $S$. In Theorem 4.1, the number of cut edges is
stated in terms of the conductance $\phi$. We have $\chi = 4\phi/(1 - 2\phi)$. Also, $\phi$ is at most $\sqrt{4\epsilon\tau/\mu}$.
We will drop the dependence on $\alpha$, since it will be fixed (more details given later).

We will also use some properties of the function H(e,fi). 

Lemma 5.4 For any fixed $\mu > 0$, $H(\epsilon, \mu)$ is a convex, decreasing function of $\epsilon$. Furthermore, there
is a value $\epsilon = \epsilon(\mu)$ such that $H(\epsilon, \mu) > 0.5029$.

Proof: First, note that $f(\sigma)$ is a decreasing function of $\sigma$. This is because all the three functions
that define $f$ are decreasing in their respective ranges, and the transition from one function to the
next occurs precisely at the point where the functions are equal.

Now, for any fixed $\mu$, $\sigma(\epsilon, \mu)$ is a strictly increasing function of $\epsilon$, and hence, $f(\sigma(\epsilon, \mu))$ is a
decreasing function of $\epsilon$. Thus, $H(\epsilon, \mu) = \int_0^1 f(\sigma(\epsilon/r, \mu))\,dr$ is a decreasing function of $\epsilon$, since for
any fixed $r$, the integrand $f(\sigma(\epsilon/r, \mu))$ is a decreasing function of $\epsilon$.

For convenience of notation, we will use $H$ and $\sigma$ to refer to $H(\epsilon, \mu)$ and $\sigma(\epsilon, \mu)$ respectively. Now
define $x = \epsilon/r$. Doing this change of variables in the integral, we get $H = \epsilon \int_\epsilon^\infty \frac{f(\sigma(x, \mu))}{x^2}\,dx$. By the
fundamental theorem of calculus, we get that

$$\frac{dH}{d\epsilon} = \int_\epsilon^\infty \frac{f(\sigma(x, \mu))}{x^2}\,dx - \frac{f(\sigma)}{\epsilon}.$$

Again applying the fundamental theorem of calculus, we get that

$$\frac{d^2 H}{d\epsilon^2} = -\frac{f(\sigma)}{\epsilon^2} + \frac{f(\sigma)}{\epsilon^2} - \frac{1}{\epsilon}\frac{df(\sigma)}{d\epsilon} \geq 0,$$

since $f(\sigma)$ is a decreasing function of $\epsilon$. Thus, $H$ is a convex function of $\epsilon$.

To show the last part, let $\sigma^{-1}$ be the inverse function of $\sigma(\epsilon, \mu)$, keeping $\mu$ fixed, and consider
$\epsilon(\mu) = \sigma^{-1}(1/4) = 1 - (\frac{3}{4})^{\mu/(1+\mu)} - o(1)$, by making $\delta$ and $\gamma$ small enough constants. For $r \in [1/4, 1/3]$,
we have $f(\sigma(\epsilon/r, \mu)) \geq f(1/4) \geq 0.535$. Thus, we get

$$H(\epsilon, \mu) \geq 0.5 + 0.035 \times (1/3 - 1/4) = 0.5029.$$

□

5.2 Proof for Simple 



As we showed in the main body, it suffices to prove Claim 5.2 
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Proof: (of Claim 5.2 ) This closely follows the analysis given in |Tre09| and |Sot0 9|. If any recursive 
call to Simple fails, then the top level algorithm also fails and outputs the trivial cut. 
Suppose MaxCut(G) is at least 1 — e. Then MaxCut(G') is at least 

(1 — e)m — lnc(Everii, Oddi) ^ 
m — Inc(Everii, Oddi) 

Applying this inductively, we can argue that whenever a recursive call Simple(G', e' , p) is made, 



MaxCut(G') > 1 — e' . From Lemma 3.7 since O(logn) vertices are chosen in P, with high 



probability, in every recursive call, a good vertex is present in P. From Lemma 3.5, in every 
recursive call, with high probability, some call to Find-threshold succeeds. Hence, Simple will 
not output FAIL and succeeds. 

Assuming the success of Simple, let us compute the total number of edges cut. We denote the 
parameters of the ith recursive call to Simple by subscripts of t. Let the number of edges in Gt 
be ptm (where po = 1). Let T be the last call to Simple. We have St = s/pt- Only for t = T, we 
have that f(a(e/pt,p)) = 1/2. In the last round, we cut p^m/2 edges. The number of cut edges 
in other rounds is f(a(et, p))(pt — Pt+i) m - Summing over all t, the total number of edges cut (as 
a fraction of m) is 



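$$
\begin{aligned}
\sum_{t=0}^{T-1} f(\alpha(\varepsilon_t,\mu))(\rho_t - \rho_{t+1}) + \rho_T/2
&= \sum_{t=0}^{T-2} \int_{\rho_{t+1}}^{\rho_t} f(\alpha(\varepsilon/\rho_t,\mu))\,dr + \int_{\rho_T}^{\rho_{T-1}} f(\alpha(\varepsilon/\rho_{T-1},\mu))\,dr + \rho_T/2 \\
&\ge \sum_{t=0}^{T-2} \int_{\rho_{t+1}}^{\rho_t} f(\alpha(\varepsilon/r,\mu))\,dr + \int_{\rho_T}^{\rho_{T-1}} f(\alpha(\varepsilon/r,\mu))\,dr + \rho_T/2 \\
&= \int_{\rho_T}^{1} f(\alpha(\varepsilon/r,\mu))\,dr + \rho_T/2.
\end{aligned}
$$

The inequality comes about because $f$ is a decreasing function and $\alpha$ is an increasing function of $\varepsilon$:
for $r \le \rho_t$ we have $\varepsilon/r \ge \varepsilon/\rho_t$, and hence $f(\alpha(\varepsilon/r,\mu)) \le f(\alpha(\varepsilon/\rho_t,\mu))$.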



We now bound the running time, using Lemma 3.5. Consider a successful iteration $t$. Suppose
the number of vertices classified in this iteration is $N_t$. The total running time in iteration $t$ is
$\tilde{O}(N_t \Delta m^{1+\mu})$. This is because we run the $O(\log n)$ calls in parallel, so the running time is at most
$O(\log n)$ times the running time of the successful call. Summed over all iterations, this is at most
$\tilde{O}(\Delta m^{2+\mu})$. If an iteration is unsuccessful, its total running time is $\tilde{O}(\Delta m^{2+\mu})$. There can
only be one such iteration, and the claimed bound follows. □
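The step from the sum over rounds to the integral can be checked numerically. The following is a minimal sketch, not part of the analysis: `g` is a hypothetical stand-in for $r \mapsto f(\alpha(\varepsilon/r,\mu))$ (the actual $f$ and $\alpha$ are defined in the main body; all that is used here is that the composition is nondecreasing in $r$), and the sequence `rho` is an arbitrary decreasing sequence with $\rho_0 = 1$ chosen for illustration.

```python
import math

def g(r, eps=0.05):
    """Hypothetical stand-in for f(alpha(eps/r, mu)): nondecreasing in r, at least 1/2."""
    return 0.5 + 0.1 * math.exp(-eps / r)

def edges_cut_sum(rho):
    """Sum over rounds t < T of g(rho_t) * (rho_t - rho_{t+1}), plus rho_T / 2."""
    total = sum(g(rho[t]) * (rho[t] - rho[t + 1]) for t in range(len(rho) - 1))
    return total + rho[-1] / 2

def integral_bound(rho, steps=100_000):
    """Midpoint-rule estimate of the integral of g over [rho_T, 1], plus rho_T / 2."""
    lo, hi = rho[-1], rho[0]
    width = (hi - lo) / steps
    integral = sum(g(lo + (k + 0.5) * width) for k in range(steps)) * width
    return integral + rho[-1] / 2

rho = [1.0, 0.7, 0.45, 0.3, 0.2]  # rho_0 = 1 > rho_1 > ... > rho_T (illustrative values)
print(edges_cut_sum(rho), integral_bound(rho))
print(edges_cut_sum(rho) >= integral_bound(rho))
```

Each round contributes the value of `g` at the right endpoint of $[\rho_{t+1}, \rho_t]$, which is the largest value on that interval, so the sum dominates the integral for any nondecreasing `g`.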

5.3 Proofs for Balance 

We first give a rather complicated expression for the approximation ratio of Balance. First, for
any $\mu > 0$, define $h(\mu) = \min_{\varepsilon} \frac{H(\varepsilon,\mu)}{1-\varepsilon}$. This is essentially the approximation factor of Simple.

Claim 5.5 The algorithm Balance has a work to output ratio of $\tilde{O}(\Delta(m^{\tau+\mu_2\tau} + m^{1+\mu_1-\tau}))$. The
approximation ratio is at least:

$$
\max_{\varepsilon_1} \min\left\{ \min_{\varepsilon} \max\left\{ \frac{1}{2(1-\varepsilon)},\ \frac{h(\mu_2)(1-\varepsilon-\varepsilon\chi(\varepsilon_1,\mu_1)) + \chi(\varepsilon_1,\mu_1)/2}{(1-\varepsilon)(1+\chi(\varepsilon_1,\mu_1))} \right\},\ H(\varepsilon_1,\mu_1),\ \frac{1}{2(1-\varepsilon_1)} \right\}
$$

Proof: First let us analyze the work to output ratio of Balance. We initially perform $\tilde{O}(\Delta m^{\tau})$
walks. Suppose we get a low conductance set $S$. We then run Simple($G_S$, $\varepsilon_2$, $\mu_2$). Here, the work
to output ratio is at most $\tilde{O}(\Delta m^{\tau+\mu_2\tau})$. If we get a tripartition, the work to output ratio is at most
$\tilde{O}(\Delta m^{1+\mu_1-\tau})$. Adding these, we get an upper bound on the total work to output ratio.



Because we choose a random subset $P$ of size $O(\log n)$, we will assume that Lemma 5.1 and
Claim 5.2 hold (without any error). To analyze the approximation ratio, we follow the progress
of the algorithm to the end. In each iteration, either a low conductance set is removed, or the
basic algorithm is run. In each iteration, let us consider the set of vertices that is assigned to some
side of the final cut. In case of a low conductance set, we get a cut for the whole set. Otherwise,
if we get a tripartition, the union $\mathrm{Even}_i \cup \mathrm{Odd}_i$ will be this set. If we do not get a tripartition,
then we output the trivial cut (thereby classifying all remaining vertices). Let us number the low
conductance sets as $S_1, S_2, \ldots$. The others are denoted $T_1, T_2, \ldots, T_f$. We will partition the edges
of $G$ into parts, defining subgraphs. The subgraph $G_S$ consists of all edges incident to some $S_i$.
The remaining edges form $G_T$. The edges of $G_S$ are further partitioned into two sets: $G_c$ is the
subgraph of cross edges, which have only one endpoint in $S$. The other edges make the subgraph
$G'_S$. The edge sets of these subgraphs are $E_S$, $E_T$, $E_c$, $E'_S$, respectively. For any set $S_i$, $G|_{S_i}$ denotes
the induced subgraph on $S_i$.

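We now count the number of edges in each set that our algorithm cuts. We can only guarantee
that half the edges in $E_c$ are cut. Let the MaxCut of $G|_{S_i}$ be $\mathrm{MaxCut}(G|_{S_i})$. Our algorithm will
cut (in each $S_i$) at least $h(\mu_2)\mathrm{MaxCut}(G|_{S_i})$ edges. This deals with all the edges in $E_S$. In $E_{T_f}$,
we can only cut half of the edges. In $E_{T_j}$ for $j < f$, we cut an $H(\varepsilon_1,\mu_1)$ fraction of edges. In total,
we cut at least

$$\sum_i h(\mu_2)\mathrm{MaxCut}(G|_{S_i}) + (1/2)|E_c| + \sum_{j<f} H(\varepsilon_1,\mu_1)|E_{T_j}| + (1/2)|E_{T_f}|$$

edges. The maxcut of $G|_{T_f}$ is at most $(1-\varepsilon_1)$ (otherwise, we would get a tripartition). So we get

$$
\begin{aligned}
\sum_{j<f} H(\varepsilon_1,\mu_1)|E_{T_j}| + (1/2)|E_{T_f}|
&\ge \sum_{j<f} H(\varepsilon_1,\mu_1)|E_{T_j}| + \frac{1}{2(1-\varepsilon_1)}\mathrm{MaxCut}(G|_{T_f})\,|E_{T_f}| \\
&\ge \min\left(H(\varepsilon_1,\mu_1),\ \frac{1}{2(1-\varepsilon_1)}\right)\mathrm{MaxCut}(G_T).
\end{aligned}
$$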

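By definition, $|E_c| \le \chi(\varepsilon_1,\mu_1)|E'_S|$. Fixing the size of $E_c \cup E'_S$, we minimize the number of edges
cut by taking this to be equality. Consider the subgraph $G_S$ and let its MaxCut value be $1-\varepsilon$.
If we remove the edges $E_c$, we get the subgraph $G'_S$. An optimal cut of $G_S$ leaves at most
$\varepsilon|E_S| = \varepsilon(1+\chi(\varepsilon_1,\mu_1))|E'_S|$ edges uncut, so, as a fraction of $|E'_S|$, the MaxCut of $G'_S$ is at least

$$1 - \varepsilon(1+\chi(\varepsilon_1,\mu_1)) = 1 - \varepsilon - \varepsilon\chi(\varepsilon_1,\mu_1).$$

Now, we lower bound the total number of edges in $G_S$ that are cut.

$$
\begin{aligned}
\sum_i h(\mu_2)\mathrm{MaxCut}(G|_{S_i}) + (1/2)|E_c|
&\ge h(\mu_2)\sum_i \mathrm{MaxCut}(G|_{S_i}) + (1/2)|E_c| \\
&\ge h(\mu_2)\mathrm{MaxCut}(G'_S) + (1/2)\chi(\varepsilon_1,\mu_1)|E'_S| \\
&\ge h(\mu_2)(1-\varepsilon-\varepsilon\chi(\varepsilon_1,\mu_1))|E'_S| + (1/2)\chi(\varepsilon_1,\mu_1)|E'_S|
\end{aligned}
$$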

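By definition of $\varepsilon$,

$$\mathrm{MaxCut}(G_S) = (1-\varepsilon)|E_S| = (1-\varepsilon)(1+\chi(\varepsilon_1,\mu_1))|E'_S|.$$

The total number of edges cut is bounded below by:

$$
\begin{aligned}
&\sum_i h(\mu_2)\mathrm{MaxCut}(G|_{S_i}) + (1/2)|E_c| + \sum_{j<f} H(\varepsilon_1,\mu_1)|E_{T_j}| + (1/2)|E_{T_f}| \\
&\quad\ge \min_{\varepsilon} \max\left( \frac{1}{2(1-\varepsilon)},\ \frac{h(\mu_2)(1-\varepsilon-\varepsilon\chi(\varepsilon_1,\mu_1))}{(1-\varepsilon)(1+\chi(\varepsilon_1,\mu_1))} + \frac{\chi(\varepsilon_1,\mu_1)}{2(1-\varepsilon)(1+\chi(\varepsilon_1,\mu_1))} \right)\mathrm{MaxCut}(G_S) \\
&\qquad + \min\left( H(\varepsilon_1,\mu_1),\ \frac{1}{2(1-\varepsilon_1)} \right)\mathrm{MaxCut}(G_T).
\end{aligned}
$$

□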



Using this we prove the main lemma about Balance (restated here for convenience):

Lemma 5.6 For any constant $b > 1.5$, there is a choice of $\mu_1$, $\mu_2$ and $\tau$ so that there is an $\tilde{O}(\Delta m^b)$
time algorithm with an approximation factor that is a constant greater than 0.5.

Proof: The algorithm Balance has a work to output ratio of $\tilde{O}(\Delta(m^{\tau+\mu_2\tau} + m^{1+\mu_1-\tau}))$. We
now set $\mu_1$ and $\mu_2$ to be constants so that the exponent in the work to output ratio is $b-1$. For this, we set
$\tau + \mu_2\tau = 1 + \mu_1 - \tau = b - 1$. Letting $\mu_1 > 0$ be a free parameter, this gives $\tau = 2 + \mu_1 - b$, and
$\mu_2 = \frac{2b-3-\mu_1}{2+\mu_1-b}$. Note that since $b > 1.5$, we can choose $\mu_1 > 0$ so that $\tau > 0$ and $\mu_2 > 0$
(any $\mu_1$ with $\max(0, b-2) < \mu_1 < 2b-3$ works).

Now, it remains to show that for any choice of $\mu_1, \mu_2 > 0$, the bound on the approximation factor
given by Claim 5.5 is greater than 0.5. For convenience of notation, we will drop the arguments
to functions and use $h$, $H$, and $\chi$ to refer to $h(\mu_2)$, $H(\varepsilon_1,\mu_1)$, and $\chi(\varepsilon_1,\mu_1)$ respectively. First, note
that $h > 0.5$. Let us set $\varepsilon_1 = \varepsilon(\mu_1)$ as from the statement of Lemma 5.4. Then $H > 0.5029$, and
$\frac{1}{2(1-\varepsilon_1)} > 0.5$ since $\varepsilon_1 > 0$. Furthermore, note that

$$\min_{\varepsilon} \max\left\{ \frac{1}{2(1-\varepsilon)},\ \frac{h(1-\varepsilon-\varepsilon\chi) + \chi/2}{(1-\varepsilon)(1+\chi)} \right\}$$

is obtained at $\varepsilon = \frac{2h-1}{2h(1+\chi)}$, and takes the value $\frac{h+h\chi}{1+2h\chi} > 0.5$ since $h > 0.5$. Thus, the minimum of all these three
quantities is greater than 0.5, and hence the approximation factor is more than 0.5. □
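The parameter choice and the closed form for the inner min-max can be sanity-checked with a short script. This is an illustrative sketch only: the function names are ours, and the numeric values of $b$, $\mu_1$, $h$ and $\chi$ below are hypothetical inputs chosen for the check, not values derived in the paper.

```python
def balance_parameters(b, mu1):
    """Solve tau + mu2 * tau = 1 + mu1 - tau = b - 1 for tau and mu2."""
    tau = 2 + mu1 - b
    mu2 = (b - 1) / tau - 1  # equals (2b - 3 - mu1) / (2 + mu1 - b)
    return tau, mu2

def minimax_point(h, chi):
    """Closed form: the eps where the two terms meet, and their common value."""
    eps = (2 * h - 1) / (2 * h * (1 + chi))
    value = (h + h * chi) / (1 + 2 * h * chi)
    return eps, value

def minimax_grid(h, chi, steps=200_000):
    """Brute-force min over eps in [0, 1/2) of the max of the two terms."""
    best = float("inf")
    for k in range(steps):
        eps = 0.5 * k / steps
        first = 1 / (2 * (1 - eps))
        second = (h * (1 - eps - eps * chi) + chi / 2) / ((1 - eps) * (1 + chi))
        best = min(best, max(first, second))
    return best

tau, mu2 = balance_parameters(b=1.6, mu1=0.1)      # hypothetical b and mu1
print(tau, mu2, tau + mu2 * tau, 1 + 0.1 - tau)    # both exponents should equal b - 1

eps_star, value = minimax_point(h=0.51, chi=0.2)   # hypothetical h and chi
print(eps_star, value, minimax_grid(h=0.51, chi=0.2))
```

The first term increases with $\varepsilon$ and, when $h > 0.5$, the second decreases, so the minimum of their maximum sits where they cross; the grid search should agree with the closed form up to the grid resolution.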



Using a more nuanced analysis of the approximation ratio, we can get better bounds. This
requires the solving of an optimization problem, as opposed to Claim 5.5. We provided the weaker
claim because it is easier to use for Lemma 5.3.



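Claim 5.7 Let us fix $\mu_1, \mu_2$. The approximation ratio can be bounded as follows: let $\varepsilon'_S, X, Y, Z$ be
variables and $\varepsilon, \varepsilon_1$ be fixed. First minimize the function:

$$\frac{1}{1-\varepsilon}\left[ \left(H(\varepsilon'_S,\mu_2) + \chi(\varepsilon_1,\mu_1)/2\right)X + H(\varepsilon_1,\mu_1)Y + \frac{Z}{2} \right]$$

with constraints:

$$
\begin{aligned}
\varepsilon'_S X + \varepsilon_1 Z &\le \varepsilon \\
(1+\chi(\varepsilon_1,\mu_1))X + Y + Z &= 1 \\
0 \le \varepsilon'_S &\le 1/2 \\
0 \le X, Y, Z &\le 1
\end{aligned}
$$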

Let this value be $\mathrm{OBJ}(\varepsilon,\varepsilon_1)$. The approximation ratio is at least

$$\max_{\varepsilon_1} \min_{\varepsilon} \max\left[ \frac{1}{2(1-\varepsilon)},\ \mathrm{OBJ}(\varepsilon,\varepsilon_1) \right]$$

Proof: To analyze the approximation ratio, we follow the progress of the algorithm to the end.
In each iteration, either a low conductance set is removed, or the basic algorithm is run. In each
iteration, let us consider the set of vertices that is assigned to some side of the final cut. In case of
a low conductance set, we get a cut for the whole set. Otherwise, if we get a tripartition, the union
$\mathrm{Even}_i \cup \mathrm{Odd}_i$ will be this set. If we do not get a tripartition, then we output the trivial cut (thereby
classifying all remaining vertices). Let us number the low conductance sets as $S_1, S_2, \ldots$. The
others are denoted $T_1, T_2, \ldots, T_f$. We will partition the edges of $G$ into parts, defining subgraphs.
The subgraph $G_S$ consists of all edges incident to some $S_i$. The remaining edges form $G_T$. The
edges of $G_S$ are further partitioned into two sets: $G_c$ is the subgraph of cross edges, which have
only one endpoint in $S$. The other edges make the subgraph $G'_S$. In $G_T$, let the edges incident
to vertices not in $T_f$ be $G'_T$. The remaining edges form the subgraph $G_f$. The edge sets of
these subgraphs are $E_S$, $E_T$, $E_c$, $E'_S$, $E_f$, $E'_T$, respectively. For any set $S_i$, $G|_{S_i}$ denotes the induced
subgraph on $S_i$.

We now count the number of edges in each set that our algorithm cuts. We can only guarantee
that half the edges in $E_c$ are cut. Let the MaxCut of $G|_{S_i}$ be $\mathrm{MaxCut}(G|_{S_i}) = 1-\tau_i$. Our
algorithm will cut (in each $S_i$) at least $H(\tau_i,\mu_2)|E_{S_i}|$ edges. This deals with all the edges in $E_S$.
In $E_{T_f}$, we can only cut half of the edges. In $E_{T_j}$ for $j < f$, we cut an $H(\varepsilon_1,\mu_1)$ fraction of edges. In total,
we cut at least

$$\sum_i H(\tau_i,\mu_2)|E_{S_i}| + (1/2)|E_c| + \sum_{j<f} H(\varepsilon_1,\mu_1)|E_{T_j}| + (1/2)|E_{T_f}|$$

edges. By convexity of $H$, we have $\sum_i H(\tau_i,\mu_2)|E_{S_i}| \ge H(\varepsilon'_S,\mu_2)|E'_S|$, where $\mathrm{MaxCut}(G'_S) = 1-\varepsilon'_S$. Putting
it all together, we cut at least

$$H(\varepsilon'_S,\mu_2)|E'_S| + H(\varepsilon_1,\mu_1)|E'_T| + (1/2)|E_f| + (1/2)|E_c|$$

edges.

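We would like to find out the minimum value this can attain, for a given $\varepsilon_1$. The parameters $\mu_1, \mu_2$
are fixed. The maxcut of $G_f$, which we write as $1-\varepsilon_f$, is at most $(1-\varepsilon_1)$ (otherwise, we would get a
tripartition). We have the following constraints:

$$
\begin{aligned}
|E_c| &\le \chi(\varepsilon_1,\mu_1)|E'_S| \\
\varepsilon'_S|E'_S| + \varepsilon_f|E_f| &\le \varepsilon m \\
|E'_S| + |E'_T| + |E_f| + |E_c| &= m \\
\varepsilon_1 \le \varepsilon_f &\le 1/2
\end{aligned}
$$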

For a given size of $E'_S$, we should maximize $E_c$ to cut the least number of edges. So we can assume
that $|E_c| = \chi(\varepsilon_1,\mu_1)|E'_S|$. Let us set $X := |E'_S|/m$, $Y := |E'_T|/m$, and $Z := |E_f|/m$. Consider
fixing $\varepsilon$ and $\varepsilon_1$. The variables are $\varepsilon'_S, \varepsilon_f, X, Y, Z$. This means the approximation ratio is at least
the minimum of

$$\frac{1}{1-\varepsilon}\left[ \left(H(\varepsilon'_S,\mu_2) + \chi(\varepsilon_1,\mu_1)/2\right)X + H(\varepsilon_1,\mu_1)Y + \frac{Z}{2} \right]$$

under the constraints:

$$
\begin{aligned}
\varepsilon'_S X + \varepsilon_f Z &\le \varepsilon \\
(1+\chi(\varepsilon_1,\mu_1))X + Y + Z &= 1 \\
\varepsilon_1 \le \varepsilon_f &\le 1/2 \\
0 \le \varepsilon'_S &\le 1/2 \\
0 \le X, Y, Z &\le 1
\end{aligned}
$$

Let $\mathrm{OBJ}(\varepsilon,\varepsilon_1)$ be the minimum value attained. We observe that given any solution, the objective
can be decreased if we decrease $\varepsilon_f$. This is because for a small decrease in $\varepsilon_f$, we can increase $Z$
(and decrease either $X$ or $Y$). This preserves all the constraints, but decreases the objective. So
we can set $\varepsilon_f = \varepsilon_1$. Our bound on the approximation ratio is

$$\max_{\varepsilon_1} \min_{\varepsilon} \max\left[ \frac{1}{2(1-\varepsilon)},\ \mathrm{OBJ}(\varepsilon,\varepsilon_1) \right]$$

□
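One way to approximate the optimization in Claim 5.7 numerically is a plain grid search. The sketch below is an assumption-laden illustration, not the procedure used for the paper's numbers: `toy_H` is a hypothetical placeholder for $H$ (the real $H$ comes from the analysis of Simple), `chi` is passed in as a given number rather than computed from $(\varepsilon_1,\mu_1)$, and the outer maximization over $\varepsilon_1$ is left to the caller.

```python
def toy_H(eps, mu):
    """Hypothetical placeholder for H(eps, mu): convex, nonincreasing in eps, at least 1/2."""
    return 0.5 + 0.1 * max(0.0, 0.25 - eps) ** 2

def obj(eps, eps1, mu1, mu2, chi, H=toy_H, grid=30):
    """Grid-search estimate of OBJ(eps, eps1), with eps_f already set to eps1."""
    best = float("inf")
    for i in range(grid + 1):
        eps_s = 0.5 * i / grid                        # 0 <= eps'_S <= 1/2
        for j in range(grid + 1):
            X = (j / grid) / (1 + chi)                # so that (1 + chi) * X <= 1
            rest = max(0.0, 1 - (1 + chi) * X)        # what is left for Y + Z
            for k in range(grid + 1):
                Z = rest * k / grid
                Y = max(0.0, rest - Z)                # enforces (1 + chi) X + Y + Z = 1
                if eps_s * X + eps1 * Z > eps + 1e-12:
                    continue                          # violates eps'_S X + eps_1 Z <= eps
                cut = (H(eps_s, mu2) + chi / 2) * X + H(eps1, mu1) * Y + Z / 2
                best = min(best, cut / (1 - eps))
    return best

def ratio_bound(eps1, mu1, mu2, chi, grid=20):
    """min over eps in [0, 1/2] of max(1/(2(1-eps)), OBJ(eps, eps1)) for one fixed eps1."""
    values = []
    for k in range(grid + 1):
        eps = 0.5 * k / grid
        values.append(max(1 / (2 * (1 - eps)), obj(eps, eps1, mu1, mu2, chi)))
    return min(values)

# All numeric inputs here are hypothetical.
print(ratio_bound(eps1=0.1, mu1=0.1, mu2=0.2, chi=0.2))
```

A grid search only gives an estimate of the minimum over the sampled points; a finer grid or a proper solver would be needed for bounds one intends to rely on.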

5.4 Running Time/ Approximation Ratio Tradeoff 




[Figure 1: plot not reproduced. The horizontal axis is the exponent $x$ in the running time $\tilde{O}(n^x)$.]

Figure 1: Running Time/Approximation Ratio Tradeoff Curve for Simple and Balance. Simple
needs running time $\tilde{O}(n^{2+\mu})$ and Balance needs running time $\tilde{O}(n^{1.5+\mu})$, for any constant $\mu > 0$.
The approximation ratio for Simple is from Lemma 5.1, and that for Balance is from Claim 5.7.



6 Conclusions and Further Work 

Our combinatorial algorithm is very natural and simple, and beats the 0.5 barrier for MaxCut. 
The current bounds for the approximation ratio we get for, say, quadratic time are quite far from 
the optimal Goemans-Williamson 0.878, or even from Soto's 0.6142 bound for Trevisan's algorithm.
The approximation ratio of our algorithm can probably be improved, and it might be possible to
get a better running time. This would probably require newer analyses of Trevisan's algorithm,
similar in spirit to Soto's work [Sot09]. It would be interesting to see if some other techniques
different from random walks can be used for MaxCut.

This algorithm naturally raises the question of whether a similar approach can be used for other 2-CSPs.
We believe that this should be possible, and it would provide a nice framework for combinatorial
algorithms for such CSPs. On a different note, our local partitioning algorithm raises very interesting
questions. Can we get such a partitioning procedure that has a better work to output ratio
(close to polylogarithmic) but does not lose the $\sqrt{\log n}$ factor in the conductance (which previous
algorithms lose)? We currently have a work to output ratio that can be made close to $\sqrt{n}$ in the worst
case. A significant improvement would be of great interest.